AWS Redshift cluster sizing
7 x dc1.large with 160 GB SSD each @ $1,380/year = 1,120 GB @ $9,660/year.
2 x ds2.xlarge with 2 TB HDD each @ $4,295/year = 4,000 GB @ $8,590/year.
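To make the per-terabyte comparison concrete, here is a minimal Python sketch of the arithmetic. It uses only the node counts, disk sizes, and yearly prices quoted above; the function name `cost_per_tb` is made up for illustration, and it assumes decimal terabytes (1 TB = 1,000 GB), which is how the 4,000 GB figure is derived.

```python
def cost_per_tb(nodes, gb_per_node, usd_per_node_year):
    """Yearly cost per terabyte for a cluster of identical nodes."""
    total_gb = nodes * gb_per_node
    total_usd = nodes * usd_per_node_year
    # Assumption: decimal terabytes (1 TB = 1000 GB).
    return total_usd / (total_gb / 1000)


# Figures taken from the two configurations above.
dc1 = cost_per_tb(nodes=7, gb_per_node=160, usd_per_node_year=1380)
ds2 = cost_per_tb(nodes=2, gb_per_node=2000, usd_per_node_year=4295)

print(f"dc1.large:  ${dc1:,.2f} per TB per year")
print(f"ds2.xlarge: ${ds2:,.2f} per TB per year")
print(f"dc1 costs {dc1 / ds2:.1f}x as much per TB")
```

Swapping in current prices or a different node count is a quick way to re-check the ratio before committing to an instance type.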
The dc1 instances in this case are around 4x as expensive per terabyte (though still quite a bargain compared to hosting the cluster in-house) and give a 30-80% performance gain, depending on the benchmark. And while you can always add nodes to accommodate data growth, you can only add nodes of the same instance type, which could become quite expensive if you're using the small-disk-capacity dc1s.

Once your cluster is up, it is vital to monitor disk and CPU utilization so you know when to add new nodes. It is highly advisable to watch the graphs under the Performance tab in the Redshift Console as you add new load to the cluster.

There are built-in CloudWatch alarms for disk usage, and these should be configured to alert above 70%. I like to know well in advance when usage is getting there, so I regularly use Period = 5 minutes, Statistic = Average, over 1 consecutive period. Since loads and vacuums can create usage spikes, you might want to configure the alert over more or longer periods.

While CloudWatch is great for this monitoring, it is convenient to also be able to compute capacity yourself. There are several ways to query disk usage that render subtly different results, unfortunately none of which match the stats given by CloudWatch. Here's an example for a 6-node 12 TB cluster that currently shows disk space as 32% used on each node in the Console, yet the query reports about 23%:

    select host
      ,sum(capacity)/3 as total
      ,sum(used)/3 as used
      ,sum(capacity)/3 - sum(used)/3 as free
      ,(((sum(used)/3)/(sum(capacity)/3.00)) * 100.00) as pct_used
    from STV_PARTITIONS
    group by host

| host | total (MB) | used (MB) | free (MB) | pct_used |
| --- | --- | --- | --- | --- |
| 0 | 1904780 | 450450 | 1454330 | 23.65 |
| 1 | 1904780 | 449895 | 1454885 | 23.62 |
| 2 | 1904780 | 449776 | 1455004 | 23.61 |
| 3 | 1904780 | 450673 | 1454107 | 23.66 |
| 4 | 1904780 | 451483 | 1453297 | 23.70 |
| 5 | 1904780 | 447840 | 1456940 | 23.51 |
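
As a rough illustration of computing capacity from these numbers, the Python sketch below takes the per-node rows from the query output and works out how much room each node has before it reaches the 70% alert threshold mentioned earlier. The names `ROWS`, `ALERT_PCT`, and `node_report` are made up for this example, and the values are treated as 1 MB blocks, which is how STV_PARTITIONS reports `capacity` and `used`. Keep in mind that these query-based figures do not match what CloudWatch reports, so the headroom is only an estimate.

```python
# (host, capacity_mb, used_mb) copied from the query output above.
ROWS = [
    (0, 1904780, 450450),
    (1, 1904780, 449895),
    (2, 1904780, 449776),
    (3, 1904780, 450673),
    (4, 1904780, 451483),
    (5, 1904780, 447840),
]

ALERT_PCT = 70.0  # alert threshold suggested in the text


def node_report(rows, alert_pct=ALERT_PCT):
    """Return (host, pct_used, mb_left_before_alert) for each node."""
    report = []
    for host, capacity, used in rows:
        pct_used = 100.0 * used / capacity
        headroom = capacity * alert_pct / 100.0 - used
        report.append((host, round(pct_used, 2), int(headroom)))
    return report


report = node_report(ROWS)
for host, pct_used, headroom in report:
    print(f"node {host}: {pct_used:.2f}% used, {headroom} MB before {ALERT_PCT:.0f}%")

# Cluster-wide view: total used over total capacity.
total_capacity = sum(capacity for _, capacity, _ in ROWS)
total_used = sum(used for _, _, used in ROWS)
cluster_pct = 100.0 * total_used / total_capacity
print(f"cluster: {cluster_pct:.2f}% used")
```

Tracking the smallest per-node headroom rather than the cluster average is the more cautious choice, since a single full node is enough to cause trouble.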