How to get cloud analytics costs under control

Big data analytics can cost big money.
But being reactive instead of proactive about optimizing your analytics in the cloud can, unfortunately, cost you even more money. This may sound obvious, but making the right decisions when configuring your cloud analytics platform and related processes now can save buckets of cash down the road.
Before you begin your journey to cloud analytics cost optimization, however, you must self-assess. You need to be honest with yourself and get a clear understanding of where you currently are, where you want to go and – maybe most importantly – what kind of costs you’re dealing with both now and in the future.
The data silo problem
If you’ve got a data silo problem – and it’s relatively easy to recognize if you do – then it’s time to face facts: your organization is spending way too much money on analysis that isn’t adding enough value. That’s because it’s almost certainly based on incomplete data sets.
In fact, Database Trends and Applications says poor data quality hurts productivity by up to 20 percent and prevents 40 percent of business initiatives from achieving targets. And a recent Gartner survey found that poor data quality costs organizations an average of $15 million every year.
Not only that, but you’re also incurring a ton of hidden costs:
- Lost employees (and clients): Good employees hate dealing with bad data. They’ll eventually grow frustrated and leave. Bad data can also lead to wrong decisions and embarrassing client mishaps
- Lost time: The more time lost fumbling with incomplete data, the less effective your employees will be (and the more frustrated they’ll get). Not to mention the needless cost of all that wasted time
- Lost opportunities: Analysis based on flawed modeling is often worse than no analysis at all. With no central ownership, having groups work with siloed data they believe to be complete is a recipe for disaster
- Large CapEx expenses associated with an on-prem system can take significant money away from other areas of the organization
- A monthly OpEx paid as a subscription fee is much easier on the corporate wallet, keeping organizations more nimble
- If performance or costs aren’t up to standard, cloud users can always cancel
- Avro vs JSON: Instead of storing data as JSON files, it’s smart to institute a standard file conversion to Avro, a compact binary format that is more size-efficient
- Compression equals savings: Similarly, compressing all your data files as a matter of process helps keep storage costs down
- Consider cold storage: Cloud platforms like Azure and GCP offer cold storage options, such as Azure Cool Blob and Google’s Nearline and Coldline, which are less expensive options for storing large datasets and archived information. Under some conditions, cold tier storage can equal savings of up to 50 percent
- Evaluate data retention policies: In a perfect world, you’d keep all your data. But if you have so much that even keeping it in cold storage is cost prohibitive, you can always change your retention policies to delete very old raw data (you always have the option of keeping the aggregate data around, which takes up less storage space). Watch our video, Data hoarding in the age of machine learning, to learn more.
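
To make the compression and retention tips above a little more concrete, here is a minimal Python sketch that uses only the standard library. Everything in it is an assumption made for illustration: the sample records, the field names (`day`, `user`, `amount`), the function names `compress_records` and `apply_retention`, and the 30-day cutoff are all hypothetical. gzip stands in for whatever compression codec your platform supports, and Avro conversion is left out because it needs a third-party library.

```python
import gzip
import json
from collections import defaultdict
from datetime import date, timedelta


def compress_records(records):
    """Serialize records as JSON Lines, then gzip them.

    Returns (raw_bytes, compressed_bytes) so the sizes can be compared.
    """
    raw = "\n".join(json.dumps(r) for r in records).encode("utf-8")
    return raw, gzip.compress(raw)


def apply_retention(records, today, max_age_days):
    """Keep recent raw records; roll older ones up into daily totals.

    Returns (kept_records, daily_totals), where daily_totals maps an
    ISO date string to the summed 'amount' of the expired records.
    """
    cutoff = today - timedelta(days=max_age_days)
    kept = []
    daily_totals = defaultdict(float)
    for r in records:
        if date.fromisoformat(r["day"]) < cutoff:
            daily_totals[r["day"]] += r["amount"]  # aggregate, drop raw row
        else:
            kept.append(r)
    return kept, dict(daily_totals)


# Hypothetical sample data: 1,000 small event records spread over 100 days.
start = date(2024, 1, 1)
records = [
    {
        "day": (start + timedelta(days=i % 100)).isoformat(),
        "user": f"user-{i % 7}",
        "amount": 1.5,
    }
    for i in range(1000)
]

raw, packed = compress_records(records)
print(f"raw: {len(raw)} bytes, gzip: {len(packed)} bytes")

kept, aggregates = apply_retention(
    records, today=date(2024, 4, 10), max_age_days=30
)
print(f"raw records kept: {len(kept)}, daily aggregates: {len(aggregates)}")
```

Repetitive data like this sample compresses very well, so the ratio it prints will flatter gzip; measure on a sample of your own files before estimating savings. The retention function shows the trade-off described above: the old raw rows go away, but a much smaller per-day summary stays available for analysis.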