
The Top Ten Data, Cloud, & Analytics Trends

Written by Paul Lewis | Aug 22, 2022

On any given day, everything can feel like a priority when you’re a CIO or CTO. We’re responsible for balancing cloud adoption with data governance and orchestration, understanding an organization’s unique data management needs, building out robust roadmaps, and much more. It’s our job to ensure our organization’s data is safe, secure, and accurate, all while providing the right data to the right people at the right time.


This still rings true for 2022. However, enterprises have widely recognized data’s transformative qualities on their key objectives—and they’re demanding even more from their technology teams. 

With in-person events now back in full swing, I’ve had the opportunity to network with a diverse range of technology leaders. Independent of their size or industry, the same questions and concerns arose again and again. Some IT teams are expected to develop sustainability and cultural metrics, expertly navigate the Great Resignation, and manage increasingly complex roadmaps, all while delivering more value with the same budget.

How do you navigate this post-pandemic world, where competitors are scrambling to harness the power of data, cloud, and analytics? How will you keep up with the pace of change as the tools for organizational transformation become increasingly available?

Across two articles, I’ll share the Top Ten topics and trends in cloud, data, and analytics on my mind for the near future. Each summarizes the challenges technology leaders are facing, my predictions on their impacts, and my take on how to approach them.


1. Consumers are taking back personal control over identifiable and private data

Today’s consumers are increasingly concerned about “surveillance capitalism”—the economic systems built on collecting, purchasing, and selling consumer data. They find the steady stream of data scandals in the news tiresome, and they’ve lost faith in the ability of even well-respected organizations to effectively secure consumer data—especially when much of that data is personally identifiable.

In addition to market forces (the iPhone’s latest stance on privacy, for example) and government action (numerous privacy bills akin to GDPR), this mistrust is changing how private user data is collected, stored, and accessed. 

That leads us to the question: whom do consumers trust to be the keepers of user credentials and authentication? Some data privacy critics believe that the blockchain’s promise of decentralization holds the answer.

Paul’s Take: Despite the blockchain hype, I don’t believe that you, as a potential vendor storing user data, will be adopting a federated solution anytime soon. Who do you call when platform issues persist? Who do you sue? To whom do you escalate your concerns? Instead, the keepers of user credentials will be the data custodians consumers already trust with their sensitive data: our governments and banks. With access controlled by these trusted parties at the consumer level and backed by legislation (think GDPR), consumers will grant vendors access to their personal data for a set amount of time.

A real-world example is single sign-on (SSO) when you pay your taxes through the Canada Revenue Agency (CRA) website: you use a trusted financial institution to verify your identity. In this same vein, consumers won’t share their private data with vendors; instead, they’ll use the bank as a centralized credential source—a go-between. This is one way I see consumers taking more control of who has access to their data and when.
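To make that go-between concrete, here’s a minimal sketch of an OAuth 2.0-style authorization-code flow, the family of protocol that typically underpins bank-verified sign-in. The endpoint URLs, client ID, and scopes are hypothetical placeholders, not the CRA’s or any bank’s actual API:

```python
import secrets
from urllib.parse import urlencode

import requests

# Hypothetical endpoints for a bank acting as identity provider; a real
# deployment would publish these via OpenID Connect discovery.
AUTHORIZE_URL = "https://id.examplebank.ca/oauth2/authorize"
TOKEN_URL = "https://id.examplebank.ca/oauth2/token"
CLIENT_ID = "example-tax-portal"            # the vendor's registered ID
REDIRECT_URI = "https://vendor.example/callback"


def build_login_url() -> str:
    """Step 1: send the consumer to their bank to authenticate."""
    state = secrets.token_urlsafe(16)       # CSRF protection
    params = {
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": "openid identity.verify",  # identity only, no account data
        "state": state,
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}"


def exchange_code(code: str) -> dict:
    """Step 2: trade the short-lived code for a time-boxed token."""
    resp = requests.post(
        TOKEN_URL,
        data={
            "grant_type": "authorization_code",
            "code": code,
            "redirect_uri": REDIRECT_URI,
            "client_id": CLIENT_ID,
        },
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()  # access_token, expires_in, id_token, ...
```

The vendor never handles the consumer’s banking credentials; it only holds a token that expires—exactly the kind of time-boxed, revocable access described above.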


2. Data mesh vs. data fabric: there will be a clear winner in governance

How often has someone asked you the difference between data fabric and data mesh? Usually, I refer to them as emerging approaches trying to solve the same challenge: the complex landscape of enterprise data management.

Some organizations have chosen the path of a data fabric. A data fabric (a centralized governance model with federated analytics) sits on top of your existing technology and connects various datasets through artificial intelligence (AI) and metadata, enabling a knowledge graph that provides insights across datasets. This hybrid centralized/decentralized model removes people from the equation as much as possible, focusing on AI insights and tight governance.

Others have opted for a data mesh. A data mesh (a federated governance model with federated analytics) is competing for adoption; it gives individuals control over their datasets within an organization, allowing them to share the data with other users without centralized processes or approvals. Distributed by design, the data mesh aims to increase business agility by making humans more efficient and reducing governance rigidity.
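To make the governance split concrete, here’s a toy Python sketch. The classes and policy rules are invented for illustration, not drawn from any product: a fabric routes every sharing decision through one central rule set, while a mesh lets each domain team decide for itself.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    domain: str        # the business domain that owns the data
    pii: bool = False


class CentralGovernance:
    """Data fabric: one organization-wide policy decides every request."""

    def can_share(self, ds: Dataset, requester_domain: str) -> bool:
        # Single rule set for all domains, e.g. PII never leaves its domain.
        return not ds.pii or requester_domain == ds.domain


class DomainGovernance:
    """Data mesh: each domain team sets and enforces its own policy."""

    def __init__(self, allow_pii_sharing: bool):
        self.allow_pii_sharing = allow_pii_sharing

    def can_share(self, ds: Dataset, requester_domain: str) -> bool:
        return not ds.pii or self.allow_pii_sharing


claims = Dataset("claims_2022", domain="insurance", pii=True)

fabric = CentralGovernance()
print(fabric.can_share(claims, "wealth_mgmt"))   # False: the central rule

mesh = DomainGovernance(allow_pii_sharing=True)  # the insurance team's call
print(mesh.can_share(claims, "wealth_mgmt"))     # True: local autonomy
```

That local autonomy is precisely where the consistency and compliance concerns below come from: two domains can legitimately reach different answers to the same question.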

Paul’s Take: If your enterprise values consistency and security within your organization—as well as your inter-organization communications—data fabric is the data governance winner. Think about a retail bank: why would it change how it governs data compared to the investment bank? Or the wealth management division? Or the insurance side? Centralized models take advantage of stricter data standards without the compliance audit headache one might experience with distributed governance. 

I still believe federated governance models have their benefits. When you factor in budget, federated models can be attractive because you’re not sharing a platform expense with other organizations. And if you follow the principle of keeping data in its original location, federated models are a good fit. However, I feel they fall flat around consistency, security, compliance, and policy. This is especially true if data policies are only followed by part of an organization, and if compliance training is not rigorous and well implemented. 


3. Citizen development is growing dramatically—but won’t replace code

By leveraging low-code and no-code (LCNC) solutions, citizen developers are creating transformative business applications. As data sharing across an organization’s functional groups becomes more commonplace, so will citizen development. 

Today, these business technologists are successfully leveraging SQL and application development capabilities within SaaS platforms (e.g., ERPs) to actualize the promise of automation, simplifying business tasks that once required IT intervention. Further accelerating the citizen developer’s rise, the data packages engineers produce are now available through APIs and other pathways, improving data accessibility and consumption.
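As a rough illustration of that consumption pattern, here’s a short Python sketch that pulls a published data package from a REST API into a dataframe. The endpoint, token, and field names are hypothetical placeholders; the shape of the interaction is the point:

```python
import pandas as pd
import requests

# Hypothetical data package published by an engineering team for
# self-serve consumption; the URL, token, and fields are placeholders.
API_URL = "https://data.example.com/api/v1/packages/sales-summary"
API_TOKEN = "replace-with-your-token"

resp = requests.get(
    API_URL,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    params={"region": "EMEA", "quarter": "2022Q2"},
    timeout=30,
)
resp.raise_for_status()

# The heavy lifting (joins, cleansing, modelling) happened upstream; the
# citizen developer just shapes the result for a report or dashboard.
df = pd.DataFrame(resp.json()["rows"])
print(df.groupby("product_line")["revenue"].sum())
```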

Paul’s Take: LCNC tools are having their moment—and deservedly so. Some believe software development can be accelerated by a factor of ten with LCNC. However, I don’t anticipate they will ever replace code. Developers and data engineers: you can breathe. There isn’t a target on your back.

Before mission-critical data gets into the hands of business users and data analysts, it must first be made usable to them. Code is required for data engineering: for the complex transformations of dozens—if not hundreds—of sources that share no common ontology, and for implementing the complex mathematical algorithms that identify classifications or patterns within data—the ones that help search for nuggets of insight gold.
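For a sense of what that engineering work looks like, here’s a minimal sketch that reconciles two sources with no shared ontology into one canonical schema. The systems, column names, and mappings are invented for illustration:

```python
import pandas as pd

# Two sources describing the same customers with no shared ontology.
crm = pd.DataFrame({
    "cust_id": [101, 102],
    "full_nm": ["Ada Lovelace", "Alan Turing"],
    "annual_rev": ["12,500", "9,800"],   # strings with thousands separators
})
billing = pd.DataFrame({
    "CUSTOMER": [101, 103],
    "name": ["A. Lovelace", "Grace Hopper"],
    "revenue_usd": [12500.0, 15000.0],
})

# The engineering work no-code tools rarely handle well: reconciling
# field names, types, and formats into one canonical schema.
canonical_crm = crm.rename(columns={
    "cust_id": "customer_id",
    "full_nm": "customer_name",
    "annual_rev": "revenue_usd",
})
canonical_crm["revenue_usd"] = (
    canonical_crm["revenue_usd"].str.replace(",", "").astype(float)
)
canonical_billing = billing.rename(columns={
    "CUSTOMER": "customer_id",
    "name": "customer_name",
})

unified = pd.concat([canonical_crm, canonical_billing], ignore_index=True)
print(unified.drop_duplicates(subset="customer_id", keep="first"))
```

Even this toy case needs renames, type coercion, and deduplication rules; multiply it by hundreds of sources and the case for dedicated engineering code makes itself.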

Code just won’t go away. After all, business users aren’t—and won’t be—expected to code. Instead, power users will build their own data presentation layers to power more significant analysis and decision-making, all without diverting IT resources.

I’ll be at the front of the line to point out that these tools have limitations. However, I believe citizen development is a huge step forward for data accessibility across organizations.


4. Data security is becoming a newfound mandate for the cybersecurity office

Cybersecurity offices have long focused their efforts on defending against network penetration. Security leaders routinely asked themselves, “How do we keep bad actors from entering our organization?” The answer was usually stronger perimeter security (with better compliance and policy management for more coverage).

Enterprises have often assumed that prioritizing the outer layers of security—perimeter, network, endpoint, and application security—reduced the need for securing data at the source. If enough resources were thrown at network penetration defenses, the data at the heart of the organization would be safe. Routine solutions include correlation and causation analyses, tighter security policies, rolling audits, and other measures for managing outside threats.

Paul’s Take: Due to this stance, data and database security have received less attention. With more bad actors than ever before—and many increasingly proficient at what they do—we’re seeing more ransomware attacks targeting the beating heart of enterprises: their data.

I predict leading organizations will no longer ignore data security while also beginning to reconsider the relationship various roles have with their data. For example, if a database has personally identifiable information, should database administrators be able to access that data? Should they only have access to a database’s administrative functions instead of the data itself? You may soon find yourself asking these questions.
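As a sketch of what that separation could look like in practice, here’s Python using psycopg2 against PostgreSQL: an operations role receives monitoring and maintenance privileges while read access to a table holding personal data is revoked. The role, schema, and table names are placeholders, and note that in Postgres a real separation also requires that day-to-day administrators not be superusers, since superusers bypass grants entirely:

```python
import psycopg2

# Connection details are placeholders; run as a privileged security role.
conn = psycopg2.connect("dbname=appdb user=security_admin")
conn.autocommit = True
cur = conn.cursor()

# Invented role, schema, and table names for illustration.
cur.execute("CREATE ROLE dba_ops NOLOGIN;")
cur.execute("CREATE ROLE analyst_role NOLOGIN;")
cur.execute("GRANT CONNECT ON DATABASE appdb TO dba_ops;")
cur.execute("GRANT USAGE ON SCHEMA app TO dba_ops;")

# Administrative surface: monitoring via Postgres's built-in pg_monitor role.
cur.execute("GRANT pg_monitor TO dba_ops;")

# ...but no read access to the table holding personal data.
cur.execute("REVOKE ALL ON app.customers FROM dba_ops;")

# Analysts get column-level access that excludes identifying fields.
cur.execute("""
    GRANT SELECT (customer_id, signup_date, plan)
    ON app.customers TO analyst_role;
""")

cur.close()
conn.close()
```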

With more attention—and budget—allocated to cybersecurity, only time will tell how organizations manage security all the way down to the data level.


5. Cloud agnostic vs. cloud native is the new cloud-first vs. cloud-only debate

The debate between cloud-first and cloud-only is eternal. As a CIO, do you choose to deploy every workload to the cloud, or do you take a cloud-first position—deploying primarily to the cloud while running some workloads elsewhere?

To be honest, I feel this is the wrong argument altogether. Many enterprises have moved on, taking a ‘cloud also’ stance—a philosophy I find more productive. The conversation is no longer about where you deploy workloads—because it’s fine to be in the public cloud, software-as-a-service (SaaS), a private cloud, or with a third party—but about which services you leverage once you operate in the cloud.

Paul’s Take: Organizations and IT leaders must make a significant decision: choosing a cloud’s native services, or taking the agnostic approach and moving workloads between hyperscale clouds. Some CIOs are asking, “Should I use Snowflake or BigQuery on Google Cloud Platform?” rather than whether they should be leveraging the power of the cloud for all of their critical workloads.
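To ground that decision, here’s the same query issued both ways in Python. Both client libraries are real, but the project, account, credentials, and table names are placeholders; the snippet illustrates the posture, not a recommendation:

```python
# The same query under two postures.

# Cloud native: lean on GCP's own service and its platform integrations.
from google.cloud import bigquery

bq = bigquery.Client(project="my-gcp-project")
native_rows = bq.query(
    "SELECT region, SUM(revenue) AS rev FROM sales.q2 GROUP BY region"
).result()

# Cloud agnostic: a vendor that runs on any hyperscaler, at the cost of a
# shallower relationship with each platform's native services.
import snowflake.connector

sf = snowflake.connector.connect(
    account="my_account", user="me", password="...", warehouse="wh"
)
agnostic_rows = sf.cursor().execute(
    "SELECT region, SUM(revenue) AS rev FROM sales.q2 GROUP BY region"
)
```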

There are obvious benefits to moving between clouds and leveraging their strengths and capabilities. But as a technology buyer, do you want to be a small fish in a big pond (cloud agnostic) or a medium fish in a big pond (cloud native)? As an IT leader, I want to build a tight relationship with my cloud provider; I want those market development funds (MDF) and the support I otherwise wouldn’t receive if I simply used them for technology installation.

Also, as a personal preference, robust hyperscalers offer numerous services that help our clients achieve their goals. Being native on these platforms doesn’t restrict what other services our customers can use—a considerable upside.

Thanks for reading. Stay tuned for the last five technology trends for IT leaders, landing next week. And be sure to let me know what predictions I got right—or wrong—in the comments section below.

Make sure to sign up for updates so you don’t miss the next post.