Scaling Your Data Science Team
In an earlier post, we explored hiring your first Data Scientist. For this individual to succeed, you need engaged executives, defined expectations, and a culture that accepts change and is willing to make decisions in new ways. Once you have hired your first Data Scientist, you can begin to think about scaling the team to deliver more projects in parallel and to support a greater number of functional areas across your business.
As your data science team scales, there are several considerations: how work is organized among employees and third parties, which investments will maximize the impact of the team, and how the team will maintain the ever-growing number of models and supporting data pipelines it builds and deploys. These considerations become inputs to deciding when you will need to hire a dedicated manager of data science and what skill set that manager will need to be successful. This manager can be your first data science hire, if that is the career path they choose to grow into, or an outside hire who is experienced in scaling data science teams.
Building an Efficient Data Science Team
A Data Scientist can be effective by themselves, but they cannot be efficient. Efficiency comes from scaling different roles and skill sets that complement one another and create a cohesive workflow for the data science team. An efficient data science team is a group of Data Scientists partnered with Data Engineers (DE) and Machine Learning Engineers (MLE). Each contributes to the collection, analysis, and modeling of data by applying their unique skill set to a stage of the process. The ratio of these complementary roles is important to getting the most out of each skill set and maximizing the velocity of the team. The ratio will vary by organization and is mostly influenced by the number and complexity of the data sources being analyzed and modeled.
Aligning Data Science with Business Functions
Engaged executives play a part in turning a data science team's findings into action. I find that leveraging the idea of “jerseys” (assigning a Data Scientist and supporting DEs and MLEs to a specific functional area within the business, such as marketing, finance, product, or a specific line of business) provides the opportunity to build depth in parts of the business and develop strong bonds between data science and business teams. This alignment, with the name of a functional area figuratively on the jersey of the data science team, comes with an expectation: the embedded relationship includes data science in organizational planning, staff meetings, and other key discussions, so that both sides align on shared priorities and learnings.
Investments in Tooling, Automation, and Testing
Beyond organizational alignment, effective and impactful data science teams require investment in tooling, automation, and testing. As organizations create and deploy more models, they must manage ongoing changes to data inputs, features, and third-party data products, as well as interactions between models.
This tooling is often referred to as MLOps, and it includes many subcomponents, such as model lifecycle management, model drift monitoring, bias detection, and checks on the integrity of training data. Automating tasks that are repetitive or prone to human error frees up teams for more impactful and creative work.
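To make one of these subcomponents concrete, here is a minimal sketch of what an automated drift check might look like. It uses the Population Stability Index (PSI), which is one common way to compare the distribution of a model input at training time with what the model sees in production. The choice of PSI, the function name, the bin count, and the sample data are all illustrative assumptions, not a method this post prescribes.

```python
import math
from typing import Sequence


def population_stability_index(
    baseline: Sequence[float],
    current: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """Compare two samples of one numeric feature; larger means more drift.

    Hypothetical helper for illustration only. Bins are equal-width and
    derived from the baseline sample.
    """
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(baseline), max(baseline)
    if hi == lo:
        raise ValueError("baseline has no spread to bin")
    width = (hi - lo) / bins

    def fractions(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            # Values outside the baseline range fall into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor each fraction at eps so the logarithm is always defined.
        return [max(c / len(values), eps) for c in counts]

    base_frac = fractions(baseline)
    curr_frac = fractions(current)
    return sum(
        (c - b) * math.log(c / b) for b, c in zip(base_frac, curr_frac)
    )


# Made-up data: the "production" sample is shifted toward higher values.
training_sample = [i / 100 for i in range(100)]
production_sample = [0.5 + i / 200 for i in range(100)]

psi = population_stability_index(training_sample, production_sample)
ALERT_THRESHOLD = 0.2  # illustrative cut-off; tune for your own models
if psi > ALERT_THRESHOLD:
    print(f"Drift alert: PSI = {psi:.2f}")
```

A check like this would typically run on a schedule for each deployed model, so that nobody has to remember to compare distributions by hand.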
Leveraging External Expertise for Data Science Maturity
The journey to scale your data science team is most easily navigated with guides who have made it before you. Leveraging outside firms to build initial models, define and build tool stacks, and automate deployments will enable your team to focus on business relationships and on improving operating models with your engaged executives.
While we do not recommend becoming dependent on third parties for your data science capabilities, they enable organizations to mature more quickly by leveraging known operating models and technology stacks to lower risk as the team grows and takes on greater business-impacting problems.
Not all of these activities must be done in parallel. However, making appropriate investments in lockstep across management, supporting skill sets, organizational alignment, and automation tooling enables your data science team to maximize its effectiveness and efficiency.
Looking Ahead – Defining and Executing Your First Data Science Project
In our next post, we will explore how to define and execute your first data science project. We will cover how to define a project charter, how to measure success, where to find the domains of maximum impact, and how to work with cross-functional teams for the first time on this new approach to value creation.