Quick notes: Amazon Relational Database Service (RDS)

- Fully managed AWS service.
- Supported RDS engines: Microsoft SQL Server, Oracle, PostgreSQL, MariaDB, Amazon Aurora, MySQL.
- Every DB instance has a weekly maintenance window, which you can specify when creating the instance. If you don’t, AWS picks a 30-minute window for you at random.
- Limit: up to 40 DB instances per account. Under the license-included model, up to 10 of the 40 can be Oracle or MS SQL Server; under the BYOL model, all 40 can be any DB engine.
- Maximum storage capacity: up to 4 TB for MS SQL Server, 6 TB for the other engines.
- RDS instance storage: EBS volumes, not instance store. General Purpose for moderate I/O requirements, Provisioned IOPS for high-performance OLTP workloads, and Magnetic for small databases.
- You can scale up both compute and storage, but you cannot decrease the storage size (no scaling down). Storage can be scaled while the instance is running; performance may degrade during the change, with improved I/O afterwards. Scaling compute causes downtime for the DB instance.
- MS SQL Server does not support storage scaling.
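The storage rules above can be expressed as a small pre-flight check. This is only a sketch: the `StorageChange` type, the engine labels, and the function name are hypothetical, and the size ceilings are the figures quoted in these notes, not values read from AWS.

```python
from dataclasses import dataclass

# Storage ceilings quoted in the notes, in GB (1 TB = 1024 GB).
MAX_STORAGE_GB = {"sqlserver": 4 * 1024}
DEFAULT_MAX_STORAGE_GB = 6 * 1024


@dataclass(frozen=True)
class StorageChange:
    engine: str        # hypothetical engine label, e.g. "mysql" or "sqlserver"
    current_gb: int
    requested_gb: int


def check_storage_change(change: StorageChange) -> list[str]:
    """Return the rules from the notes that a storage change would break."""
    problems = []
    if change.requested_gb < change.current_gb:
        problems.append("storage cannot be scaled down")
    if change.engine == "sqlserver" and change.requested_gb != change.current_gb:
        problems.append("MS SQL Server does not support storage scaling")
    limit = MAX_STORAGE_GB.get(change.engine, DEFAULT_MAX_STORAGE_GB)
    if change.requested_gb > limit:
        problems.append(f"requested size exceeds the {limit} GB limit")
    return problems


print(check_storage_change(StorageChange("mysql", 100, 200)))  # []
print(check_storage_change(StorageChange("mysql", 200, 100)))  # ['storage cannot be scaled down']
```

An empty list means the change is consistent with the notes; anything else names the rule it violates.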
Multi-AZ option
- You can select the Multi-AZ option at instance launch. RDS creates a standby instance in a different AZ in the same region. The main purpose of Multi-AZ is high availability and disaster recovery.
- Replication between primary and standby is synchronous.
- You cannot read from or write to the standby. You can’t choose the standby’s AZ, but you can view it.
- Failover to the standby is automatic if the primary fails, and takes from one to a few minutes depending on the instance class.
- Provisioned IOPS is recommended for Multi-AZ.
- Triggers for failover: loss of the primary’s AZ, primary DB failure, loss of network connectivity to the primary, compute or storage failure on the primary, a change to the primary’s instance type, patching of the primary instance.
- For a manual failover, the only option is to reboot the primary with the failover option.
- During failover, the CNAME of the RDS DB instance is updated to point to the standby’s IP address. Primary and standby have different IPs but share the same CNAME/endpoint, which is why you should reference a Multi-AZ DB instance by its endpoint and not by IP address.
For system maintenance, make the changes on the standby first, then promote the standby to primary, so the original primary becomes the standby. Then make the changes on the new standby. A version upgrade, however, is applied to both primary and standby at the same time and causes an outage. It happens during the maintenance window unless you force it to apply immediately.
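As a rough sketch of the manual failover and the endpoint advice, the two helpers below take an RDS client as a parameter. The assumption is that the client is created elsewhere with boto3, for example `boto3.client("rds")`; the helper names and the instance identifier are made up for illustration.

```python
def reboot_with_failover(rds_client, instance_id):
    """Trigger a manual Multi-AZ failover by rebooting the primary."""
    return rds_client.reboot_db_instance(
        DBInstanceIdentifier=instance_id,
        ForceFailover=True,  # the "reboot with failover" option
    )


def current_endpoint(rds_client, instance_id):
    """Look up the instance endpoint (CNAME) and port instead of caching an IP."""
    response = rds_client.describe_db_instances(DBInstanceIdentifier=instance_id)
    endpoint = response["DBInstances"][0]["Endpoint"]
    return endpoint["Address"], endpoint["Port"]
```

Because the endpoint stays the same across a failover, an application that connects through the address returned by `current_endpoint` does not need to be reconfigured afterwards.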
Backups
Automated Backups
- Stored in S3.
- For Multi-AZ, automated backups are taken from the standby instance, not the primary.
- The DB instance must be in the available (active) state for automated backups to run. In any other state, such as storage-full, the backup will not happen.
- Used for point-in-time DB instance recovery.
- Instance backups happen daily and transaction logs are captured as well, so you can restore to any point within the retention period, up to about the last 5 minutes.
- Enabled by default. To disable, set the retention period to 0.
- No additional charge for taking backups; you pay only for the S3 storage they use.
- Automated backups are deleted when you delete your DB instance.
- Retention period: default 7 days (console), 1 day (CLI). Exception: Aurora defaults to 1 day regardless of how it is configured. Maximum is 35 days.
- For MySQL, only supported with the InnoDB storage engine, not MyISAM.
To share an automated backup, first create a copy of it (a manual snapshot). Manual snapshots can be shared; automated backups cannot be shared directly.
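The copy-then-share step might look like the following with boto3. Treat it as a sketch under assumptions: `rds_client` is expected to be a boto3 RDS client created by the caller, and the function name, snapshot identifiers, and account ID are placeholders.

```python
def share_automated_backup(rds_client, automated_snapshot_id, manual_snapshot_id, account_id):
    """Copy an automated backup to a manual snapshot, then share the copy."""
    # Step 1: the copy of an automated backup is a manual snapshot.
    rds_client.copy_db_snapshot(
        SourceDBSnapshotIdentifier=automated_snapshot_id,
        TargetDBSnapshotIdentifier=manual_snapshot_id,
    )
    # Step 2: wait until the copy is available before changing its attributes.
    rds_client.get_waiter("db_snapshot_available").wait(
        DBSnapshotIdentifier=manual_snapshot_id
    )
    # Step 3: grant another AWS account permission to restore from the copy.
    rds_client.modify_db_snapshot_attribute(
        DBSnapshotIdentifier=manual_snapshot_id,
        AttributeName="restore",
        ValuesToAdd=[account_id],
    )
```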
Manual Snapshots
- Cannot be used for Point-In-Time DB instance recovery.
- Stored in S3.
- Not deleted automatically, even if you delete the RDS instance. They stay in S3 until you delete them manually.
- Can be shared directly with other AWS accounts.
- When you restore from a DB snapshot, a new DB instance is created with a different endpoint. You can also change the storage type of the new instance.
RTO (Recovery Time Objective): the maximum acceptable time to restore service after a disruption. Ex: if a disaster occurs at 12 pm and the RTO is 8 hours, the DR process should restore the business by 8 pm.
RPO (Recovery Point Objective): the acceptable amount of data loss, measured in time. Ex: if a disaster occurs at 12 pm and the RPO is 1 hour, the system should recover all data from before 11 am, so at most 1 hour of data is lost.
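The two examples are simple time arithmetic, which a short sketch makes explicit. The function name and the calendar date are arbitrary choices for illustration; only the 12 pm disaster time, the 8-hour RTO, and the 1-hour RPO come from the examples above.

```python
from datetime import datetime, timedelta


def recovery_targets(disaster_at, rto, rpo):
    """Return the restore deadline (RTO) and the oldest acceptable data point (RPO)."""
    return {
        "restore_by": disaster_at + rto,           # service must be back by this time
        "recover_data_up_to": disaster_at - rpo,   # data before this time must survive
    }


disaster = datetime(2024, 1, 1, 12, 0)  # arbitrary date, 12 pm
targets = recovery_targets(disaster, rto=timedelta(hours=8), rpo=timedelta(hours=1))
print(targets["restore_by"].strftime("%H:%M"))          # 20:00
print(targets["recover_data_up_to"].strftime("%H:%M"))  # 11:00
```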
Read Replicas (RR)
- A replica of the primary (for Multi-AZ) or of a standalone RDS DB instance.
- Can be used only for reads.
- Useful for read-heavy applications that have reached the read I/O capacity of the RDS instance; replicas scale that read capacity. The main purpose is scalability and performance.
- Replication from the primary to the read replica is asynchronous.
- To act as a replication source for an RR, the primary must have automated backups enabled (retention period > 0).
- RR can be cross-AZ or cross-region.
- Limit: 5 RRs per DB instance.
- An RR can be promoted to a standalone DB instance.
- You can specify the AZ for an RR. You can also change its storage type or instance class (the RR must have at least as much storage and compute as the source; it cannot be scaled down below it).
You can have a read replica of a read replica, up to a maximum chain of 4 instances. Lag accumulates, though, because each hop replicates asynchronously.
Primary → RR → RR → RR
- For Multi-AZ, if the primary fails, failover happens and the standby becomes the new primary; the RRs then switch to the new primary (the former standby) as their replication source.
- If the primary is deleted, each RR is promoted to a standalone single-AZ DB instance, unless you delete the RRs too.
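The prerequisites and limits for read replicas can be collected into one check. This is a sketch only: the function and constant names are hypothetical, and the numbers (5 replicas per source, a chain of 4 instances) are the ones stated in these notes.

```python
MAX_REPLICAS_PER_SOURCE = 5   # "5 per DB instance" in the notes
MAX_CHAIN_LENGTH = 4          # Primary -> RR -> RR -> RR


def replica_blockers(backup_retention_days, existing_replicas, source_position):
    """List reasons a new read replica cannot be created from a source.

    source_position is the source's place in the chain:
    1 = primary, 2 = first-level replica, and so on.
    """
    blockers = []
    if backup_retention_days <= 0:
        blockers.append("enable automated backups on the source (retention > 0)")
    if existing_replicas >= MAX_REPLICAS_PER_SOURCE:
        blockers.append("source already has 5 read replicas")
    if source_position >= MAX_CHAIN_LENGTH:
        blockers.append("replica chain is already 4 instances long")
    return blockers


print(replica_blockers(backup_retention_days=7, existing_replicas=2, source_position=1))  # []
```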
Encryption
- You cannot encrypt an existing unencrypted DB instance directly. Instead, create a new encrypted instance and migrate your data to it, or restore from a backup/snapshot into a new encrypted instance.
- RDS supports encryption at rest using AWS KMS.
- RDS supports SSL encryption for communication between App instances and RDS DB instances.
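One way the snapshot route to an encrypted instance could be scripted is sketched below. It assumes `rds_client` is a boto3 RDS client supplied by the caller and that the encryption is applied while copying the snapshot with a KMS key; the function name and all identifiers are placeholders.

```python
def restore_as_encrypted(rds_client, source_instance_id, new_instance_id, kms_key_id):
    """Snapshot an unencrypted instance, encrypt the copy, restore a new instance."""
    plain_snapshot = f"{source_instance_id}-plain"
    encrypted_snapshot = f"{source_instance_id}-encrypted"
    waiter = rds_client.get_waiter("db_snapshot_available")

    # 1. Take a manual snapshot of the unencrypted instance.
    rds_client.create_db_snapshot(
        DBSnapshotIdentifier=plain_snapshot,
        DBInstanceIdentifier=source_instance_id,
    )
    waiter.wait(DBSnapshotIdentifier=plain_snapshot)

    # 2. Copy the snapshot, supplying a KMS key so the copy is encrypted.
    rds_client.copy_db_snapshot(
        SourceDBSnapshotIdentifier=plain_snapshot,
        TargetDBSnapshotIdentifier=encrypted_snapshot,
        KmsKeyId=kms_key_id,
    )
    waiter.wait(DBSnapshotIdentifier=encrypted_snapshot)

    # 3. Restore the encrypted copy into a new instance (it gets a new endpoint).
    return rds_client.restore_db_instance_from_db_snapshot(
        DBInstanceIdentifier=new_instance_id,
        DBSnapshotIdentifier=encrypted_snapshot,
    )
```

The original instance is left untouched, so the application has to be pointed at the new instance's endpoint once the restore completes.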