Getting Started with Amazon DocumentDB (with MongoDB Compatibility)

May 2019
Notices
Customers are responsible for making their own independent assessment of the
information in this document. This document: (a) is for informational purposes only, (b)
represents AWS current product offerings and practices, which are subject to change
without notice, and (c) does not create any commitments or assurances from AWS and
its affiliates, suppliers or licensors. AWS products or services are provided “as is”
without warranties, representations, or conditions of any kind, whether express or
implied. AWS responsibilities and liabilities to its customers are controlled by AWS
agreements, and this document is not part of, nor does it modify, any agreement
between AWS and its customers.

© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved.
Contents
Introduction
Key Features of Amazon DocumentDB
AWS Regions and Availability Zones
Limitations of Traditional Architectures
Amazon DocumentDB: Cloud Native Architecture
Amazon DocumentDB Architecture
  High Availability
  High Performance
  Scalability
  Automatic scaling storage
Security and Compliance
  AWS IAM
  Network Security
  Encryption
  User Management
  Auditing Events
  Compliance
Backup and Restore
Managing Amazon DocumentDB
Monitoring
Migrating to Amazon DocumentDB
  Offline Migration
  Online Migration
  Hybrid Approach
Connecting to Amazon DocumentDB
  Replica Set Mode
  Cluster Endpoint
  Reader Endpoint
  Instance Endpoint
Conclusion
Contributors
Abstract
Amazon DocumentDB (with MongoDB compatibility) is a fast, scalable, highly available,
and fully managed document database service that supports MongoDB workloads. This
paper covers the architecture and key features of Amazon DocumentDB, and helps you
understand how you can use Amazon DocumentDB to run large, mission-critical
MongoDB workloads. This whitepaper also covers Amazon DocumentDB security,
scalability, performance, and approaches to migrate to Amazon DocumentDB.

The target audience of this whitepaper is solutions architects, database administrators,
and developers with a basic understanding of cloud computing, AWS, and MongoDB.
Amazon Web Services Getting Started with Amazon DocumentDB (with MongoDB Compatibility)

Introduction
When developing modern applications, document databases like MongoDB are a
popular choice for storing semi-structured data for use cases like product catalogs, user
profiles, mobile applications, and content management. These databases can grow to
multiple terabytes in size and may need to scale to millions of reads per second. Setting
up and managing large, highly available, high-performance MongoDB databases on
your own can be complex and challenging.

Amazon DocumentDB is a fully managed database service that is
MongoDB-compatible. Amazon DocumentDB is designed to give you the performance, scalability,
and availability you need when operating mission-critical MongoDB workloads. You can
use the same MongoDB application code, drivers, and tools you do today to run and
manage your workloads on Amazon DocumentDB.

Key Features of Amazon DocumentDB


Fully Managed Service
AWS takes care of the hardware provisioning, patching, backups, high availability, and
durability. This frees you from time-consuming administration tasks and lets you focus
on building your applications.

Compatible with MongoDB


Amazon DocumentDB is MongoDB-compatible and implements the MongoDB 3.6 API.
It emulates the responses a MongoDB client expects from a MongoDB server. This
means that you can continue to use your existing MongoDB drivers and tools with
Amazon DocumentDB with little or no change. Updating your application to use Amazon
DocumentDB could be as simple as redirecting the application to the Amazon
DocumentDB endpoint after migrating your data.

Highly Scalable
In Amazon DocumentDB, storage and compute are decoupled, and can be scaled
independently. You can start with a cluster containing one instance, and add up to 15
read replicas to support millions of reads per second. You do not have to provision
storage in advance—Amazon DocumentDB automatically scales provisioned storage up
to 64 TB as your data grows.

Fault Tolerant
Amazon DocumentDB is highly durable. Your data is replicated six ways across three
Availability Zones. Amazon DocumentDB transparently handles the loss of up to two out
of six data copies without losing write availability, or three out of six copies without
losing read availability.

High Performance
Amazon DocumentDB uses an all SSD, log-structured storage engine that is purpose-
built for database workloads. Amazon DocumentDB delivers twice the throughput of
currently available managed MongoDB services.

Automatic, Continuous, Incremental Backups and Point-in-time Recovery


Amazon DocumentDB's backup capability enables point-in-time recovery for your
clusters. You can restore your data to a new cluster to any specified second within your
backup retention period, up until the last five minutes. Your automatic backup retention
period can be configured up to thirty-five days. Automated backups are stored
in Amazon S3, which is designed for 99.999999999% durability. Amazon DocumentDB
backups are automatic, incremental, and continuous and have no impact on cluster
performance.

Near Instant Crash Recovery


Amazon DocumentDB uses log-structured storage and does not require crash recovery
replay of database redo logs, greatly reducing restart times (typically less than 60
seconds).

Highly Secure
Amazon DocumentDB runs in your Amazon Virtual Private Cloud (Amazon VPC), and
encrypts connections using Transport Layer Security (TLS) to secure data in transit.
Amazon DocumentDB also enables encryption of data at rest in the default
configuration.

AWS Regions and Availability Zones


To understand the architecture of Amazon DocumentDB and how it helps you design
highly available applications on AWS, you must be familiar with AWS global
infrastructure.

The AWS Global Infrastructure comprises AWS Regions and Availability Zones. AWS
Regions are separate geographic areas. AWS Regions consist of multiple, physically
separated and isolated Availability Zones that are connected with low latency, high
throughput, highly redundant networking. Availability Zones consist of one or more
discrete data centers, each with redundant power, networking, and connectivity, and
housed in separate facilities (Figure 1).

Figure 1: AWS Regions and Availability Zones

These Availability Zones enable you to operate production applications and databases
that are more highly available, fault tolerant, and scalable than possible when using a
single data center. You can deploy your applications and databases across multiple
Availability Zones. In the unlikely event that one Availability Zone fails, user
requests are routed to your application instances in another Availability Zone. This
approach helps your application remain available through the loss of a single
Availability Zone.

Limitations of Traditional Architectures

Figure 2: Architecture of traditional databases

Traditional databases have monolithic architectures: the compute and storage layers
are tightly coupled and cannot be scaled independently. Scalability is handled by adding
more nodes, each with its own compute and storage. Adding an extra node for scaling
or replacing a failed node requires that you copy or replicate the existing data to the
new node; this process can take hours, days, or even weeks for large databases.

Figure 3: Copying data to a newly added node on traditional database (time consuming)

Amazon DocumentDB: Cloud Native Architecture
Amazon DocumentDB is designed for the cloud and to avoid the limitations of traditional
database architectures.

Decoupled Compute and Storage


The compute and storage layers are decoupled in Amazon DocumentDB, and can be
scaled independently. The primary instance and replicas share the same cluster
volume. Adding a read replica or replacing a failed instance does not require copying
any data, and can be performed in a few minutes regardless of the size of your data.

Figure 4: Amazon DocumentDB: Decoupled compute and storage

Fault-Tolerant Design
In Amazon DocumentDB, the durability is handled at the storage layer. Whether your
cluster contains a single instance or 16 instances, you have the same level of durability
for your data.

Amazon DocumentDB divides its database volume into 10-GB segments, each
distributed across the cluster, thus isolating the blast radius of disk failures. Each
segment is replicated six ways across three Availability Zones.

Figure 5: Data replicated six ways across three Availability Zones

Amazon DocumentDB storage is also self-healing; data blocks and disks are
continuously scanned for errors and replaced automatically. Amazon DocumentDB
monitors disks and storage nodes for failures and automatically replaces or repairs the
disks and storage nodes without the need to interrupt read or write processing from the
database.

Low-Latency Read Replicas


You can create up to 15 Amazon DocumentDB replicas across multiple Availability
Zones to scale your read traffic. Amazon DocumentDB replicas share the same
underlying storage as the source instance, avoiding the need to copy data to replicas to
keep them in sync. This approach frees up more processing power to serve read
requests and reduces the replica lag time—typically under 100 milliseconds. As the
primary instance and replicas share the same storage, adding a replica does not require
any data to be copied. You can add a replica within minutes regardless of the size of
your data.

Figure 6: Add up to 15 replicas in minutes regardless of size of data

Amazon DocumentDB Architecture


The following diagram shows the main components of Amazon DocumentDB.

Figure 7: Amazon DocumentDB architecture

• Cluster: A cluster consists of one or more instances that provide the compute,
and a cluster volume that manages the data for the instances. A cluster can
have up to 16 instances (a primary and up to 15 read replicas). Cluster
instances need not be all of the same instance size.

• Primary instance: An instance that supports read/write workloads and performs
all the data modifications to the cluster volume. Each Amazon DocumentDB
cluster has one primary instance.

• Cluster volume: The cluster volume provides SSD-backed storage for your
database. The primary instance and any Amazon DocumentDB replicas share
the same cluster volume.

• Replicas: An Amazon DocumentDB replica supports only read operations, and
each DB cluster can have up to 15 Amazon DocumentDB replicas. If the
primary instance fails, one of the Amazon DocumentDB replicas is promoted to
primary.

High Availability
Amazon DocumentDB has a number of features that make it highly available.

Highly Available Distributed Storage


Amazon DocumentDB replicates your data six ways across three Availability Zones.
The cluster volume spans three Availability Zones in a single AWS Region, and each
Availability Zone contains two copies of the cluster volume data. This functionality
means that Amazon DocumentDB can transparently handle the loss of up to two data
copies or an Availability Zone failure without losing write availability, or the loss of up to
three data copies without losing read availability.

Figure 8: Amazon DocumentDB tolerates loss of three copies of data
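
The availability rules in this section can be restated as a small check. The sketch below is illustrative only: it assumes the thresholds stated above (writes remain available with at least four of six copies, reads with at least three), and the function and constant names are hypothetical rather than part of any AWS API.

```python
# Illustrative model of the availability rules described above.
# The names are hypothetical; this is not an AWS API.

TOTAL_COPIES = 6        # two copies in each of three Availability Zones
COPIES_PER_AZ = 2
WRITE_MIN_COPIES = 4    # writes tolerate the loss of up to two copies
READ_MIN_COPIES = 3     # reads tolerate the loss of up to three copies


def availability(lost_copies: int) -> dict:
    """Return read/write availability after losing `lost_copies` copies."""
    if not 0 <= lost_copies <= TOTAL_COPIES:
        raise ValueError("lost_copies must be between 0 and 6")
    surviving = TOTAL_COPIES - lost_copies
    return {
        "write": surviving >= WRITE_MIN_COPIES,
        "read": surviving >= READ_MIN_COPIES,
    }


# Losing a whole Availability Zone removes two copies.
print(availability(COPIES_PER_AZ))
# Losing an Availability Zone plus one more copy removes three.
print(availability(COPIES_PER_AZ + 1))
```

Under these assumptions, the loss of an entire Availability Zone leaves both reads and writes available, while the loss of one further copy leaves the volume readable but not writable.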

Near Instant Crash Recovery


Amazon DocumentDB is designed to recover from a crash almost instantaneously.
Unlike other databases, Amazon DocumentDB does not need to replay redo logs after a
crash before making the database available for operations. The storage is organized in
many small segments and Amazon DocumentDB can perform crash recovery
asynchronously on parallel threads. This approach reduces database restart times to
less than 60 seconds in most cases.

Automatic failover without data loss


Amazon DocumentDB uses automated checks to detect failure of the primary instance
in a cluster. If the primary instance fails, Amazon DocumentDB automatically fails over
to any of up to 15 Amazon DocumentDB replicas with minimal downtime and availability
impact on applications. For higher availability, we recommend placing at least one
replica in a different Availability Zone from the primary instance. Failover happens with
no data loss, and redo log replay is not required, because the replicas and the primary
instance share the same storage.

Figure 9: Any of 15 replicas can be promoted as the primary without data loss

The following table gives guidelines on configurations for meeting different availability
goals for your Amazon DocumentDB database.

Table 1: Typical Deployment Configurations

Availability Goal    Total Instances    Replicas    Availability Zones
99%                  1                  0           1
99.9%                2                  1           2
99.99%               3                  2           3

Failover Tiers
Each Amazon DocumentDB replica instance is associated with a failover tier (0–15).
When a failover occurs due to maintenance or an unlikely hardware failure, the primary
instance fails over to a replica with the lowest numbered priority tier. If multiple replicas
have the same priority tier, the primary fails over to that tier's replica that is the closest
in size to the primary.

By setting the failover tier for a group of select replicas to 0 (the highest priority), you
can ensure that a failover promotes one of the replicas in that group. Further, you can
effectively prevent specific replicas from being promoted to primary if there is a failover
by assigning a low-priority tier (high number) to these replicas. This is useful in cases
where specific replicas are receiving heavy use by an application and failing over to one
of them would negatively affect a critical application.
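
The selection rule described above (lowest tier number first, then closest in size to the primary) can be sketched as follows. This is a minimal illustration, not the service's implementation: the Replica class, the instance names, and the use of memory in GiB as the measure of "size" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    name: str          # hypothetical instance identifier
    tier: int          # failover tier, 0 (highest priority) to 15
    memory_gib: int    # stand-in for "instance size" (an assumption)


def pick_failover_target(primary_memory_gib: int, replicas: list) -> Replica:
    """Pick the replica that the rules above would promote."""
    if not replicas:
        raise ValueError("no replicas available for failover")
    lowest_tier = min(r.tier for r in replicas)
    candidates = [r for r in replicas if r.tier == lowest_tier]
    # Tie-break within the tier: closest in size to the primary.
    return min(candidates, key=lambda r: abs(r.memory_gib - primary_memory_gib))


replicas = [
    Replica("replica-a", tier=0, memory_gib=16),
    Replica("replica-b", tier=0, memory_gib=64),
    Replica("analytics", tier=15, memory_gib=768),
]
print(pick_failover_target(64, replicas).name)  # prints replica-b
```

In this example the replica named "analytics" sits in tier 15, so it is only promoted if no replica in a lower-numbered tier exists.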

High Performance
Amazon DocumentDB scales to millions of requests per second with millisecond
latencies, and achieves twice the throughput of currently available MongoDB managed
services. It uses a number of optimizations to achieve this.

Log Structured Storage


The database engine is tightly integrated with an SSD-based, virtualized storage layer
purpose-built for database workloads, reducing write operations to the storage system.
Tasks related to replication, log processing, and backups are offloaded to the storage
layer, reducing the load on compute instances.

Unlike traditional databases, where the compute node must periodically checkpoint data
and flush dirty blocks from buffers to disk, in Amazon DocumentDB only the write-ahead
log records are written to storage. This reduces unnecessary communication between
the compute and storage, enabling more efficient use of network I/O.

Quorum-based Reads and Writes


I/O operations use distributed systems techniques such as quorums to improve
performance consistency and tolerance to outliers. Data write operations are
acknowledged as soon as they are committed by four out of six storage nodes, and
individual storage nodes acknowledge the write operations as soon as the log records
are persisted to disk. A slow or failed storage node does not impact database
performance or availability due to the use of the quorum model.
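
To illustrate why a slow or failed storage node does not delay writes under this model, the sketch below computes when a write would be acknowledged under the four-of-six quorum described above. It is a simplified model under stated assumptions: the latencies are made-up numbers, and the function name is hypothetical.

```python
WRITE_QUORUM = 4    # acknowledgements needed, per the text above
STORAGE_NODES = 6


def write_ack_latency_ms(node_latencies_ms):
    """Latency at which a write is acknowledged under a 4-of-6 quorum.

    Each entry is the time one storage node takes to persist the log
    record, or None if that node has failed and never responds.
    """
    if len(node_latencies_ms) != STORAGE_NODES:
        raise ValueError("expected one latency per storage node")
    responded = sorted(t for t in node_latencies_ms if t is not None)
    if len(responded) < WRITE_QUORUM:
        return None  # quorum cannot be reached; the write is not acknowledged
    # The write completes when the fourth-fastest node has persisted it.
    return responded[WRITE_QUORUM - 1]


# One slow node (250 ms) and one failed node do not delay the write.
print(write_ack_latency_ms([2.0, 3.0, 2.5, 250.0, None, 3.5]))  # prints 3.5
```

The acknowledgement time is set by the fourth-fastest node, so the two slowest responses (including any outlier) are never on the critical path.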

Survivable Caches
In Amazon DocumentDB, the database buffer cache has been moved out of the
database process. If a database restarts, the cache remains warm, and performance is
not impacted due to a cold cache, as is the case with traditional databases. This
approach lets you resume fully loaded operations much faster.

Figure 10: Cache is separate from database and survives database restart

Automatic, Continuous Backups


Amazon DocumentDB continuously backs up data to Amazon S3, which is designed for
99.999999999% durability. Amazon DocumentDB backups are automatic, incremental,
and continuous, and have no impact on database performance, as the backup is
offloaded to the storage layer. By default, backup and the ability to perform a point-in-
time restore is enabled on all Amazon DocumentDB clusters.

Figure 11: Backups are offloaded to the storage layer and do not impact performance

Amazon DocumentDB’s backup capability enables point-in-time recovery for your
cluster. This functionality allows you to restore your database to any second during
your retention period (up to the last 5 minutes) with only a few clicks.

Scalability
Amazon DocumentDB is designed to be highly scalable. Amazon DocumentDB
supports both vertical and horizontal scaling. You can scale vertically by increasing the
size of your instances. You can scale horizontally by adding up to 15 read replicas,
supporting millions of requests per second. The primary instance and read replicas
share the same storage, and read replicas can be added in a few minutes with minimal
impact on database availability. Amazon DocumentDB can automatically scale your
storage up to 64 TB as your data grows and you only pay for the storage that you use.

Scaling Up
Amazon DocumentDB instances are available in various sizes, starting from the
db.r5.large instance with 2 vCPUs and 16-GiB RAM, to the db.r5.24xlarge instance with
96 vCPUs and 768-GiB RAM. The complete list of Amazon DocumentDB instance types
and regional availability can be found on the Amazon DocumentDB pricing page.

You choose an appropriate instance type based on the RAM, vCPU, and network
throughput required. You can start with a smaller instance type like db.r5.large or
db.r5.xlarge, and scale up to a larger instance type as your application grows. Compute
scaling operations typically complete in a few minutes irrespective of the size of your
data. Scaling does not require any copying of data because the storage and compute
layers are decoupled in Amazon DocumentDB. Scaling up is useful if you want to scale
your write capacity or to provision a larger read replica instance for running read-only
analytics workloads.

Figure 12: Scale up or scale down in minutes without moving any data
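
As an illustration of scaling an instance programmatically, the sketch below uses the AWS SDK for Python (boto3) and its modify_db_instance call for Amazon DocumentDB. It is a minimal example under stated assumptions: the instance identifier is hypothetical, the helper function names are invented for this example, and running scale_instance requires boto3 and valid AWS credentials.

```python
def build_scale_request(instance_id: str, instance_class: str) -> dict:
    """Keyword arguments for the DocumentDB ModifyDBInstance API call."""
    return {
        "DBInstanceIdentifier": instance_id,
        "DBInstanceClass": instance_class,
        # Apply now rather than waiting for the next maintenance window.
        "ApplyImmediately": True,
    }


def scale_instance(instance_id: str, instance_class: str) -> dict:
    """Send the request. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3

    client = boto3.client("docdb")
    return client.modify_db_instance(
        **build_scale_request(instance_id, instance_class)
    )


# "sample-instance" is a hypothetical identifier.
print(build_scale_request("sample-instance", "db.r5.xlarge"))
```

Separating the request builder from the call keeps the parameters easy to review before anything is sent to AWS.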

Scaling Out
You can scale out your cluster by adding read replicas. You can add up to 15 read
replicas and scale your read capacity to millions of requests per second. The replica lag
is low (usually less than 100 milliseconds) because the read replicas and the primary
instance share the same storage volume. You can add replicas in minutes without any
downtime or impact to database performance.

Figure 13: Add up to 15 replicas in minutes without downtime

Automatic scaling storage


With Amazon DocumentDB, unlike traditional databases, you do not have to provision
storage space explicitly while creating the database. Amazon DocumentDB data is
stored in an SSD-backed virtual volume (cluster volume) that automatically grows as the
amount of data in the database increases. The volume grows in increments of 10 GB up
to a maximum of 64 TB. This process is transparent to your application, without any
impact on application availability or performance.

Security and Compliance


With Amazon DocumentDB, best practices are the default. Authentication, encryption-
at-rest, and encryption-in-transit are enabled by default. You can control access to
Amazon DocumentDB management operations, such as creating and modifying
clusters, instances, and more, using AWS IAM users, roles, and policies. You can
authenticate users to an Amazon DocumentDB database via standard MongoDB tools
and drivers.

AWS IAM
Amazon DocumentDB is integrated with AWS Identity and Access Management (IAM)
and provides you the ability to control the actions that your AWS IAM users and groups
can take on specific Amazon DocumentDB resources, including clusters, instances, and
snapshots. In addition, you can enable resource-level permissions by tagging your
Amazon DocumentDB resources, and configuring IAM rules based on the tags.

Network Security
Amazon DocumentDB clusters are VPC-only and are created directly in your VPC.
Amazon VPC lets you provision a logically isolated section of the Amazon Web
Services (AWS) cloud where you can launch AWS resources in a virtual network that
you define. Amazon VPC enables you to isolate your cluster in your own virtual network
and connect to your on-premises IT infrastructure using industry-standard encrypted
IPsec VPNs. You have complete control over your virtual networking environment,
including selection of your own IP address range, creation of subnets, and configuration
of route tables and network gateways. You can leverage multiple layers of security,
including security groups and network access control lists (ACLs), to help control
access in each subnet. This approach gives you complete control over who can access
your Amazon DocumentDB database.

Encryption
Amazon DocumentDB supports TLS to encrypt connections from applications to secure
data in transit. Amazon DocumentDB also supports encryption of data at rest using
AES-256. Encryption is applied cluster wide and all of the data is encrypted, including
the cluster data, indexes, snapshots, logs, and automated backups. Encryption keys are
managed by AWS Key Management Service (AWS KMS), which is a highly available,
durable, and secure solution for managing sensitive encryption keys. With AWS KMS,
you can either use the service-managed key or create your own encryption keys.

User Management
You can connect to Amazon DocumentDB using standard MongoDB tools and drivers.
Amazon DocumentDB supports authentication using the Salted Challenge Response
Authentication Mechanism (SCRAM), which is the default authentication mechanism
with MongoDB.

When you create an Amazon DocumentDB cluster, you specify a master user name
and password. The master user has administrative permissions for the cluster. You can
connect as the master user to Amazon DocumentDB and create additional users as
required using db.createUser.

Auditing Events
Amazon DocumentDB supports auditing of the operations performed on your cluster.
Once auditing is enabled, Amazon DocumentDB tracks authentication, Data Definition
Language (DDL), and user management events. For example, with the auditing feature,
you can track failed login attempts, or DDL operations like the creation of collections or
indexes. These audit records are exported as JSON documents to Amazon CloudWatch
Logs for you to analyze and monitor.

Compliance
Amazon DocumentDB is designed to meet the highest security standards and to make it
easy for you to verify our security and meet your own regulatory and compliance
obligations. Amazon DocumentDB has been assessed to comply with PCI DSS; ISO
9001, 27001, 27017, and 27018; and SOC 2. It is also HIPAA eligible.

Backup and Restore


Amazon DocumentDB backs up your cluster volume automatically and retains backup
data for the length of the backup retention period (between 1 and 35 days). Amazon
DocumentDB’s backup capability enables point-in-time recovery of your cluster to any
second during your retention period, up to the last 5 minutes, with just a few clicks.
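
The restorable window described above can be worked out with simple date arithmetic. The sketch below is illustrative: it assumes the cluster has existed for the whole retention period, treats the five-minute lag as exact, and uses hypothetical function names.

```python
from datetime import datetime, timedelta, timezone

LATEST_RESTORABLE_LAG = timedelta(minutes=5)   # "up to the last 5 minutes"
MAX_RETENTION_DAYS = 35


def restore_window(now: datetime, retention_days: int):
    """Return (earliest, latest) restorable times for a retention period."""
    if not 1 <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention period must be between 1 and 35 days")
    earliest = now - timedelta(days=retention_days)
    latest = now - LATEST_RESTORABLE_LAG
    return earliest, latest


def is_restorable(target: datetime, now: datetime, retention_days: int) -> bool:
    """True if `target` falls inside the restorable window."""
    earliest, latest = restore_window(now, retention_days)
    return earliest <= target <= latest


now = datetime(2019, 5, 1, 12, 0, tzinfo=timezone.utc)
print(restore_window(now, retention_days=7))
print(is_restorable(now - timedelta(minutes=2), now, retention_days=7))
```

With a seven-day retention period, a point two minutes in the past is not yet restorable, while a point ten minutes in the past is.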

If you want to retain a backup beyond the maximum retention period, you can take a
snapshot of the cluster. DB snapshots are user-initiated backups of your database that
are kept until you explicitly delete them.

Backups are stored in Amazon S3, which is designed for 99.999999999% durability.
Backups are automatic, incremental, and continuous. Backups have no impact on
database availability or performance because the backups are offloaded to the storage
layer.

To restore your data, you can create a new cluster quickly from the backup Amazon
DocumentDB maintains, or from a cluster snapshot.

Managing Amazon DocumentDB


You can access and manage your Amazon DocumentDB cluster in several ways. When
you are getting started with Amazon DocumentDB, the simplest way is to use the AWS
Management Console.

In addition to using the AWS Management Console, you can manage Amazon
DocumentDB using the AWS Command Line Interface (AWS CLI), or you can
programmatically interact with and manage your Amazon DocumentDB cluster using
the AWS SDKs and libraries. AWS SDKs and libraries are available for many popular
languages like Java, PHP, Python, Ruby, and .NET.

Monitoring
You can monitor Amazon DocumentDB using several methods. You can monitor the
health and status of your Amazon DocumentDB cluster and your Amazon DocumentDB
instances using the AWS Console or the AWS CLI. Amazon DocumentDB integrates
with Amazon CloudWatch and you can monitor performance metrics like CPU
utilization, memory, IOPS, and network throughput using Amazon CloudWatch.

Figure 14: Amazon DocumentDB CloudWatch metrics

Amazon DocumentDB tracks the events related to your cluster. You can view the
history of the events including details on snapshot creation, failover, instance reboots,
and any modifications to your cluster. You can use the AWS Console or the AWS CLI
(describe-events command) to view these event details.

Migrating to Amazon DocumentDB


You can migrate your data from any MongoDB database, either on-premises or in the
cloud (e.g. a MongoDB database running on Amazon EC2), to Amazon DocumentDB.
You can migrate your data from the source MongoDB database to Amazon
DocumentDB using a number of approaches.

Offline Migration
The simplest approach is to do an offline migration. Because Amazon DocumentDB is
compatible with the MongoDB API, you can use the mongodump tool to export the data
from MongoDB, and the mongorestore tool to restore the data into Amazon
DocumentDB. The offline migration method results in downtime while your dump and
restore operations are running. This method is suitable for migration of non-production
workloads or for migration of non-critical databases where you can afford the downtime.

Figure 15: Offline migration approach
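
The dump and restore steps can be sketched as two command lines, built here in Python so they can be inspected before being run. This is a hedged example, not a prescribed procedure: the host names, user name, and certificate file name are hypothetical, the TLS flag names (--ssl, --sslCAFile) are those of the MongoDB 3.6-era tools and may differ in newer releases, and the password is deliberately left off the command line.

```python
import shlex


def mongodump_command(source_host: str, port: int, out_dir: str) -> list:
    """Arguments for exporting the source MongoDB database to out_dir."""
    return ["mongodump", "--host", source_host, "--port", str(port), "--out", out_dir]


def mongorestore_command(target_host, port, username, ca_file, dump_dir) -> list:
    """Arguments for restoring the dump into Amazon DocumentDB over TLS.

    No --password argument is included, so the password does not appear
    on the command line.
    """
    return [
        "mongorestore",
        "--host", target_host,
        "--port", str(port),
        "--ssl",
        "--sslCAFile", ca_file,
        "--username", username,
        dump_dir,
    ]


# All host names, the user name, and the file name below are placeholders.
dump = mongodump_command("source.example.internal", 27017, "dump")
restore = mongorestore_command(
    "sample-cluster.example.com", 27017, "masteruser", "ca-bundle.pem", "dump"
)
print(shlex.join(dump))
print(shlex.join(restore))
```

The argument lists can be passed to subprocess.run once the placeholders are replaced with real values; the application should be stopped or made read-only for the duration, since this approach involves downtime.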

Online Migration
For migration of production workloads with minimal downtime, you can use the online
approach or the hybrid approach. With the online migration approach, you use AWS
Database Migration Service (DMS) to migrate the data from MongoDB to Amazon
DocumentDB. DMS performs an initial full load of the data from the MongoDB source to
Amazon DocumentDB. During the full load, your source database is available for
operations. Once the full load is completed, DMS switches to change data capture
(CDC) mode to keep the source (MongoDB) and destination (Amazon DocumentDB) in
sync. Once the databases are in sync, you can switch your applications to point to
Amazon DocumentDB with near zero downtime.

See the AWS Database Migration Service documentation for more information on
migrating from MongoDB to Amazon DocumentDB.

Figure 16: Online migration approach using DMS
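
A DMS task that performs the full load and then switches to CDC can be described with the parameters below. This is a sketch under stated assumptions: it uses the boto3 create_replication_task call, the task identifier and ARNs are placeholders, the helper function names are hypothetical, and the selection rule simply includes every database and collection.

```python
import json


def build_replication_task_request(
    task_id, source_arn, target_arn, instance_arn,
    migration_type="full-load-and-cdc",
):
    """Keyword arguments for the AWS DMS CreateReplicationTask API call."""
    table_mappings = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-all",
                # "%" is a wildcard: every database and collection.
                "object-locator": {"schema-name": "%", "table-name": "%"},
                "rule-action": "include",
            }
        ]
    }
    return {
        "ReplicationTaskIdentifier": task_id,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        # Full load first, then ongoing change data capture (CDC).
        "MigrationType": migration_type,
        "TableMappings": json.dumps(table_mappings),
    }


def create_task(task_id, source_arn, target_arn, instance_arn):
    """Send the request. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3

    request = build_replication_task_request(
        task_id, source_arn, target_arn, instance_arn
    )
    return boto3.client("dms").create_replication_task(**request)


# The identifier and ARNs below are placeholders, not real resources.
request = build_replication_task_request(
    "mongodb-to-docdb", "<source-endpoint-arn>",
    "<target-endpoint-arn>", "<replication-instance-arn>",
)
print(request["MigrationType"])
```

Passing migration_type="cdc" instead would describe a task that only replicates ongoing changes.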

Hybrid Approach
The hybrid approach is a combination of the offline and online migration approaches.
The hybrid approach is useful in a scenario where you need minimal downtime during
migration, but the size of the source database is large or sufficient bandwidth is not
available to migrate the data in a reasonable amount of time.

The hybrid approach has two phases. In the first phase, you export the data from the
source MongoDB using the mongodump tool, transfer it to AWS (if the source is on-
premises), and restore it to Amazon DocumentDB. You can use AWS Direct
Connect or AWS Snowball to transfer the export dump to AWS. During this phase, the
source (MongoDB) is available for operations and the data restored to Amazon
DocumentDB does not contain the latest changes.

In the second phase, you use DMS in CDC mode to copy the changes from the source
(MongoDB) to Amazon DocumentDB and keep them in sync. Once the databases are in
sync, you can switch your applications to point to Amazon DocumentDB with near zero
downtime.
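
The relationship between the migration approaches and the DMS task type can be summarized in a small lookup. This is a sketch of one reading of the descriptions above: the online approach corresponds to a full load followed by CDC, and the second phase of the hybrid approach corresponds to CDC only. The strings are the migration type values that DMS accepts for a replication task; the function and dictionary names are hypothetical.

```python
# Approach name -> DMS migration type for the replication task.
DMS_MIGRATION_TYPE = {
    "online": "full-load-and-cdc",  # DMS loads everything, then replicates changes
    "hybrid": "cdc",                # data was already restored from a dump
}


def dms_migration_type(approach):
    """Return the DMS migration type for a migration approach."""
    if approach == "offline":
        # The offline approach uses mongodump and mongorestore, not DMS.
        raise ValueError("the offline approach does not use a DMS task")
    try:
        return DMS_MIGRATION_TYPE[approach]
    except KeyError:
        raise ValueError(f"unknown migration approach: {approach!r}") from None


if __name__ == "__main__":
    for name in ("online", "hybrid"):
        print(name, "->", dms_migration_type(name))
```

For the hybrid case, a CDC-only task also needs a starting point that lines up with when the dump was taken; that detail is outside the scope of this sketch.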

Figure 17: Hybrid migration approach using DMS

Although write operations with existing indexes can be parallelized, foreground and
background index builds are single-threaded. Regardless of the approach, pre-creating
indexes in your Amazon DocumentDB cluster before importing your data usually results
in a faster migration time.
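
One possible way to find the indexes to pre-create is to read them from the metadata files that mongodump writes next to each collection. The sketch below assumes each metadata file is JSON with an "indexes" array whose entries carry "name" and "key" fields with plain numeric directions; verify this against your own dump, because the layout can differ between tool versions. The function name and sample document are hypothetical.

```python
import json


def index_specs_from_metadata(metadata_json):
    """Return (index_name, [(field, direction), ...]) for each index to pre-create.

    The default _id index is skipped because it exists on every collection.
    """
    metadata = json.loads(metadata_json)
    specs = []
    for index in metadata.get("indexes", []):
        if index.get("name") == "_id_":
            continue
        # Key order matters for compound indexes; json.loads preserves it.
        specs.append((index["name"], list(index["key"].items())))
    return specs


if __name__ == "__main__":
    # Hypothetical metadata content for one collection.
    sample = """
    {"options": {},
     "indexes": [
       {"v": 2, "key": {"_id": 1}, "name": "_id_"},
       {"v": 2, "key": {"city": 1, "age": -1}, "name": "city_1_age_-1"}
     ]}
    """
    for name, keys in index_specs_from_metadata(sample):
        print(name, keys)
```

The resulting key lists can then be created on the empty Amazon DocumentDB collections with your driver of choice before the data is restored.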

Connecting to Amazon DocumentDB
Amazon DocumentDB is compatible with the MongoDB 3.6 API. With Amazon
DocumentDB, you can run the same application code and use the same drivers and
tools that you use with MongoDB. For example, you can use the mongo shell to connect
to Amazon DocumentDB and perform operations like creating and editing collections
and documents. You can connect to Amazon DocumentDB in replica set mode
(recommended) or by using the endpoints for your cluster. There are three types of
endpoints for Amazon DocumentDB—the cluster endpoint, the reader endpoint, and the
instance endpoint.

Figure 18: Connect to Amazon DocumentDB with your existing tools via the endpoints

Replica Set Mode
When you connect in replica set mode, your Amazon DocumentDB cluster appears to
your drivers and clients as a replica set. Connecting to the cluster endpoint in replica set
mode is the recommended approach for general use. Replica set mode is
advantageous for high availability and effectively balancing client requests in your
cluster. Instances added and removed from your Amazon DocumentDB cluster are
reflected automatically in the replica set configuration. You can connect to your Amazon
DocumentDB cluster endpoint in replica set mode by specifying the replica set
name rs0. Connecting in replica set mode enables your database client to specify Read
Concern, Write Concern, and Read Preference options.
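
A replica set mode connection is typically expressed as a MongoDB connection string that names the replica set. The sketch below builds such a string using only the Python standard library. The replica set name rs0 comes from the text above; the host name, credentials, the ssl=true option, and the secondaryPreferred read preference are assumptions chosen for illustration, and the function name is hypothetical.

```python
from urllib.parse import quote_plus, urlencode


def build_docdb_uri(host, username, password, port=27017,
                    replica_set="rs0", read_preference="secondaryPreferred"):
    """Build a MongoDB connection string for replica set mode."""
    options = urlencode({
        "ssl": "true",                      # assumes TLS is enabled on the cluster
        "replicaSet": replica_set,          # rs0 for Amazon DocumentDB
        "readPreference": read_preference,  # illustrative choice
    })
    # Percent-encode credentials so characters such as '@' or '/' are safe.
    return "mongodb://{}:{}@{}:{}/?{}".format(
        quote_plus(username), quote_plus(password), host, port, options
    )


if __name__ == "__main__":
    # Hypothetical cluster endpoint and credentials.
    print(build_docdb_uri("docdb.example.internal", "appuser", "p@ss/word"))
```

The resulting string can be passed to any MongoDB 3.6-compatible driver; the driver then discovers the instances in the cluster through the replica set configuration.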

Cluster Endpoint
The cluster endpoint connects to your cluster’s current primary instance. The cluster
endpoint can be used for read and write operations. The cluster endpoint provides
failover support. If your cluster’s current primary instance fails, the cluster endpoint
automatically redirects connection requests to a new primary instance. You do not have
to make changes to your application after a failover.

Reader Endpoint
The reader endpoint load balances read-only connections across all available replicas
in your cluster including the primary instance. When you add a replica instance to your
Amazon DocumentDB cluster, it is made available for load balancing read connections
using the reader endpoint. This means that you do not have to make any application
changes while adding or removing read replicas in your cluster.

Instance Endpoint
You can also connect to any instance in your cluster using the instance endpoint. The
recommended way to connect to your cluster is to use the cluster endpoint for
read/write operations and the reader endpoint for read operations. However, there may
be scenarios where you create a larger than normal read replica for running analytic
workloads. You can use the instance endpoint to connect and run those analytical
queries against the larger instance without affecting other instances in the cluster.
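
The guidance on the three endpoint types can be sketched as a small routing helper. Everything named here (the class, the function, the workload labels, and the host names) is hypothetical and exists only to illustrate which endpoint suits which kind of traffic.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterEndpoints:
    cluster: str                                    # read/write, follows the primary
    reader: str                                     # load-balanced read-only connections
    instances: dict = field(default_factory=dict)   # instance name -> instance endpoint


def pick_endpoint(endpoints, workload, instance_name=None):
    """Return the endpoint to use for a given kind of workload."""
    if workload == "read_write":
        return endpoints.cluster
    if workload == "read_only":
        return endpoints.reader
    if workload == "analytics":
        # Analytical queries target one specific (for example, larger) instance.
        if instance_name not in endpoints.instances:
            raise KeyError(f"unknown instance: {instance_name!r}")
        return endpoints.instances[instance_name]
    raise ValueError(f"unknown workload: {workload!r}")


if __name__ == "__main__":
    # Hypothetical host names.
    eps = ClusterEndpoints(
        cluster="cluster.example.internal",
        reader="reader.example.internal",
        instances={"analytics-1": "analytics-1.example.internal"},
    )
    print(pick_endpoint(eps, "read_write"))
    print(pick_endpoint(eps, "read_only"))
    print(pick_endpoint(eps, "analytics", "analytics-1"))
```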

See the Amazon DocumentDB documentation for step-by-step instructions on creating
an Amazon DocumentDB cluster and connecting to it.

Conclusion
Amazon DocumentDB is a secure, highly available, MongoDB-compatible database that
is purpose-built for the cloud. It can scale to millions of requests per second and run
highly scalable, mission-critical MongoDB workloads.

Amazon DocumentDB is a fully managed service. You do not need to worry about
database management tasks, such as hardware provisioning, patching, setup,
configuration, or backups. This frees you from time-consuming administration tasks and
lets you focus on building your applications.

Contributors
Contributors to this document include:

• Ashok Sundaram, Solutions Architect, Amazon Web Services
