Introduction to Databases

Choosing the Right Database

AWS offers many managed databases to choose from.
Questions to ask when choosing the right database for your architecture:
Read-heavy, write-heavy, or balanced workload? Throughput needs? Will it change? Does it need to scale or fluctuate during the day?
How much data to store and for how long? Will it grow? Average object size? How are the objects accessed?
Data durability? Source of truth for the data?
Latency requirements? Concurrent users?
Data model? How will you query the data? Joins? Structured? Semi-Structured?
Strong schema? More flexibility? Reporting? Search? RDBMS / NoSQL?
License costs? Switch to Cloud Native DB such as Aurora?

RDBMS (= SQL / OLTP): RDS, Aurora – great for joins
NoSQL databases – no joins, no SQL: DynamoDB (~JSON), ElastiCache (key / value pairs), Neptune (graphs), DocumentDB (for MongoDB), Keyspaces (for Apache Cassandra)
Object Store: S3 (for big objects) / Glacier (for backups / archives)
Data Warehouse (= SQL Analytics / BI): Redshift (OLAP), Athena, EMR
Search: OpenSearch (JSON) – free text, unstructured searches
Graphs: Amazon Neptune – models relationships between data
Ledger: Amazon Quantum Ledger Database
Time series: Amazon Timestream
Note: some of these databases are covered in the Data & Analytics section
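
As a study aid, the category list above can be written down as a small lookup table. This is only a sketch: the names CATEGORIES, services_for, and categories_for are made up for illustration and are not part of any AWS SDK. "QLDB" is used as shorthand for Amazon Quantum Ledger Database. The table simply mirrors the notes and is not a sizing or selection tool.

```python
# Hypothetical study-aid lookup: the database categories from these notes.
# Nothing here calls AWS; the names are illustrative assumptions.
CATEGORIES = {
    "rdbms": ["RDS", "Aurora"],
    "nosql": ["DynamoDB", "ElastiCache", "Neptune", "DocumentDB", "Keyspaces"],
    "object_store": ["S3", "Glacier"],
    "data_warehouse": ["Redshift", "Athena", "EMR"],
    "search": ["OpenSearch"],
    "graph": ["Neptune"],
    "ledger": ["QLDB"],  # Amazon Quantum Ledger Database
    "time_series": ["Timestream"],
}


def services_for(category: str) -> list[str]:
    """Return the services listed under a category, or an empty list."""
    key = category.strip().lower().replace(" ", "_")
    return list(CATEGORIES.get(key, []))


def categories_for(service: str) -> list[str]:
    """Return every category in the table that lists the given service."""
    return [name for name, services in CATEGORIES.items() if service in services]


if __name__ == "__main__":
    print(services_for("Data Warehouse"))
    print(categories_for("Neptune"))
```

Looking a service up in reverse shows that Neptune appears under both the NoSQL and the graph category, which matches the list above.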