Data agility

Prasad Prabhakaran

# Thomas Kuhn: the man who changed the way the world looked at science

Fifty years ago, a book by Thomas Kuhn altered the way we look at the philosophy behind science, and introduced the much-abused phrase 'paradigm shift'.
I strongly believe that we are reaching the inflection point of just such a shift.

The aim here is to separate agility from the many other buzzwords that flood the IT and business worlds and demonstrate the intimate link between agility, digital transformation, and enterprise success.

Agility describes how quickly an enterprise can respond to new opportunities and new threats. Do you want your business to steer like a cruise ship, or like a speed boat that can turn on a dime? It's a choice!

Domain-driven design, microservices and DevOps changed the way we developed software over the last decade. Data analytics, however, did not keep pace. To speed up data-based decision making in a company that has adopted this modern development approach, both analytics teams and software teams need to change.

What does a 21st-century data landscape look like? It's decentralised, and very different from what we see in most companies today.
A small, domain-aligned, cross-functional team (a speed boat) needs a decentralised approach to data. Data must be treated as a product by the team that generates it, and that team must serve it to others. Analytics teams and software teams need to change!

(1) Domain teams must treat data as a product that they serve to everyone else, including analytics teams
(2) Analytics teams must build on that: stop hoarding data and instead pull it in on demand
(3) Analytics teams must start to treat their data lakes/data warehouses as data products as well
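To make the first principle concrete, here is a minimal sketch of what "data as a product" could look like in code. Everything here is hypothetical (the `DataProduct` class, field names, and the `orders.completed` product are illustrations, not an established API): the point is that the producing domain team publishes an explicit contract — owner, schema, freshness — and validates its own output against it.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical sketch: a domain team publishes a "data product" with an
# explicit owner, schema, and freshness guarantee, instead of dumping
# rows into a shared lake for someone else to interpret.
@dataclass
class DataProduct:
    name: str
    owner_team: str                  # the domain team accountable for quality
    schema: dict                     # field name -> type: the published contract
    freshness_sla_minutes: int       # how stale consumers may expect data to be
    fetch: Callable[[], list]       # on-demand access for consumers

    def validate(self, rows: list) -> bool:
        # The producing team checks its own output against the contract.
        return all(set(row) == set(self.schema) for row in rows)

# The order domain serves its data; analytics pulls it on demand.
orders = DataProduct(
    name="orders.completed",
    owner_team="order-domain",
    schema={"order_id": "str", "total": "float"},
    freshness_sla_minutes=15,
    fetch=lambda: [{"order_id": "A1", "total": 42.0}],
)

rows = orders.fetch()
assert orders.validate(rows)
```

Note the inversion of responsibility: quality and schema checks live with the domain team that generates the data, not with a central pipeline downstream.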

Let us clarify what we mean by 'domain': a sphere of knowledge or activity.

Every business line is a rich domain, e.g. the customer domain, the order domain, and so on. Domains generate a lot of data as a by-product, and many people in the organisation need that data: data engineering teams, marketing people, data scientists, management, etc.

Today in most organisations, a central team of data engineers supplies all of the data via ETL tools or streaming solutions. They maintain a central data lake or data warehouse, with a BI front end for marketing and management. Data scientists might take data straight from the data lake, which is probably the easiest way for them to access it.

What problems do we see with this architecture?
a) This architecture creates a central bottleneck in the data engineering team
b) The domain knowledge is likely to be lost on the way through the central hub
c) Prioritising all the different heterogeneous requirements is difficult
It's as if many speed boats are waiting for fuel from a single centralised supply. How can you expect business agility and enterprise success?

We don't want to stay stuck with the current data platforms: centralised and monolithic, with highly coupled pipeline architectures operated by silos of hyper-specialised data engineers. In effect, that is 'lipstick agile'.
Instead we need to shift to a paradigm that draws from modern distributed architecture: distributed data products that are oriented around domains and owned by independent cross-functional teams. The teams have embedded data engineers and data product owners, using common data infrastructure as a platform to host, prep and serve their data assets.
Data agility comes from making domains the primary concern, applying platform thinking to create a self-service data infrastructure, and treating data as a product.
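The "platform thinking" part can be sketched too. In this hypothetical example (the `DataPlatform` class, product names, and storage locations are all invented for illustration), the shared infrastructure is generic and self-service: any domain team publishes without filing a ticket with a central team, and any consumer discovers and pulls what it needs.

```python
# Hypothetical sketch of self-service data infrastructure: a shared
# registry that any domain team can publish to and any consumer can
# discover from. The platform is generic; the products stay domain-owned.
class DataPlatform:
    def __init__(self):
        self._products = {}

    def publish(self, name, owner, location):
        # Domain teams self-serve: no central data-engineering ticket queue.
        self._products[name] = {"owner": owner, "location": location}

    def discover(self, name):
        # Consumers (analytics, data science) pull what they need on demand.
        return self._products[name]

platform = DataPlatform()
platform.publish("customers.profile", owner="customer-domain",
                 location="s3://customer-domain/profile/")
platform.publish("orders.completed", owner="order-domain",
                 location="s3://order-domain/completed/")

info = platform.discover("orders.completed")
assert info["owner"] == "order-domain"
```

The design choice to note: the platform team owns the registry and hosting, while accountability for each product's content stays with its domain team — the embedded data engineers and data product owners described above.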

Below are some interesting and practical points of view for further reading/listening:

Holley Holland is working with a number of enterprises to help them launch their 'speed boats' — that is, achieve business agility — through new, domain-aligned product structures and governance, backed by the required decentralised data strategies.

So when are you launching your ‘speed boat’?