How to do data science without big data

Too often, IT managers put their data science initiatives on hold until they can build a solid layer of data engineering. They wait until a data warehouse is available before planning data analytics projects, assuming that advanced analytics is critical to transformational business value and that large volumes of neatly organized data are a prerequisite.

Nothing could be further from the truth.

Here are four things to keep in mind if you don’t have big data but want to pursue data science initiatives.


1. Business issues should determine what kind of analysis you need

About 80% of data science projects fail to deliver business results, Gartner estimates. One of the main reasons for this is that leaders don’t choose the right business problems to solve. Most data analysis projects are chosen based on available data, available skills, or available tool sets. These are recipes for failure; a data analysis project should never start with data or analysis.

The best way to start the data science journey is to do some soul searching on organizational strategy. Discover the most important issues your target users want to solve and see if their resolution will have the desired business impact. The business challenges you choose will dictate which analytical approach you take and therefore the data you need.

Not having any data to start with can even be an advantage: when you begin with a clean slate, you aren’t weighed down by old baggage. Organizations with a longer legacy, on the other hand, often face costly digital transformations.

Take the example of Moderna, which has built a digitally driven culture since its inception in 2010. It created a data and analytics platform to serve its business priorities, which center on mRNA-based drug development. This focused approach allowed Moderna to create the COVID-19 vaccine blueprint in just two days.

2. Your analytical approach dictates the data you source

Organizations can spend months building data warehouses only to find that the data they’ve collected isn’t good enough to perform the analysis they need. Machine learning algorithms often require data of a particular type, volume, or granularity. Trying to create a perfect data engineering layer without specifying how it will be used is a wasted effort.

When you have visibility into organizational strategy and the business issues to be resolved, the next step is to finalize your analytical approach. Determine whether you need descriptive, diagnostic, or predictive analytics, and how the insights will be used. This will clarify the data you need to collect. If finding data is a challenge, phase the collection process so the analytics solution can progress iteratively.

For example, the executives of a large computer manufacturer we worked with wanted to understand what drives customer satisfaction. So they implemented a customer experience analytics program that started with direct customer feedback via voice-of-the-customer surveys. Descriptive insights presented as data stories helped improve net promoter scores in the next survey.
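A net promoter score of the kind the surveys above track can be computed directly from raw 0-10 ratings. The function and sample responses below are a hypothetical sketch, not the manufacturer’s actual program:

```python
def net_promoter_score(ratings):
    """Return NPS in [-100, 100] from a list of 0-10 survey ratings.

    Promoters rate 9-10, detractors rate 0-6;
    NPS = % promoters - % detractors.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)

# Hypothetical batch of 10 responses: 5 promoters, 3 passives, 2 detractors.
survey = [10, 9, 9, 10, 9, 8, 7, 7, 5, 6]
print(net_promoter_score(survey))  # 30.0
```

Tracking this one number survey over survey is a purely descriptive analysis, yet it is often enough to show whether interventions are working.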

In the following quarters, they expanded their analytics to include social media comments and competitor performance, using sources such as Twitter, discussion forums, and double-blind market research. To analyze this data, they used advanced machine learning techniques. The solution helped generate $50 million in additional revenue per year for the client.


3. Data collection starts with small, readily available data

When we think about the prerequisites for machine learning, big data often comes to mind. But it’s a misconception that you need big data to deliver transformational business value. Many executives mistakenly assume that you have to collect millions of data points to uncover hidden business information.

Once you’ve settled on the goals, business issues, and analytical approach, the next step is to assemble the data for analysis. Many business challenges can be solved with simple descriptive analyses on small spreadsheets of data. By lowering the entry barrier to a few hundred rows, you can manually gather data from systems, digitize paper records, or configure simple systems to capture the data you need.
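A descriptive analysis on that scale needs nothing beyond a spreadsheet export and the standard library. The sketch below uses made-up sales rows to stand in for manually gathered data; the column names are illustrative assumptions:

```python
import csv
import io
import statistics

# A few rows standing in for a manually gathered spreadsheet export.
raw = """region,units_sold
North,120
South,95
North,130
East,80
South,110
"""

rows = list(csv.DictReader(io.StringIO(raw)))
units = [int(r["units_sold"]) for r in rows]

# Basic descriptive statistics: often enough to answer the business question.
print("rows:", len(units))
print("mean:", statistics.mean(units))
print("median:", statistics.median(units))

# Totals by region, computed without any analytics platform.
totals = {}
for r in rows:
    totals[r["region"]] = totals.get(r["region"], 0) + int(r["units_sold"])
print(totals)
```

Swapping `io.StringIO(raw)` for an open file handle turns this into a working script over a real CSV export.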


In another example, a mattress manufacturer we worked with wanted to use analytics to improve production efficiency. As a midsize business early in its data journey, it had a small data footprint, mostly made up of manually prepared spreadsheets. Rather than putting analytics on hold, the company undertook a diagnostic analysis project to optimize performance.

It digitized the machine data available on paper, combined it with manually prepared data in a handful of spreadsheets, then used simple statistical techniques to analyze hundreds of rows and identify levers for optimization. By surfacing insights such as optimizing production batches to control for temperature and humidity, the recommendations showed a potential yield improvement of 2.3%, which translated into an additional $400,000 in annual top-line sales.
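One simple statistical technique that fits this kind of diagnostic question is a plain correlation between an operating condition and yield. The readings below are invented for illustration; the `pearson` helper is a generic textbook formula, not the manufacturer’s actual analysis:

```python
import statistics

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length series."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

# Hypothetical digitized machine readings: batch temperature vs. batch yield.
temps  = [20.1, 22.5, 25.0, 27.3, 30.2, 24.0]
yields = [96.0, 95.1, 93.8, 92.5, 90.9, 94.2]

r = pearson(temps, yields)
print(round(r, 2))  # a strongly negative r flags temperature as a lever
```

Correlation alone doesn’t prove causation, but on a few hundred rows it is a cheap way to shortlist which conditions deserve a controlled experiment.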

4. Take an incremental approach to deliver transformational value from data

The key takeaway here is to avoid parking your data science initiatives because you have limited amounts of data. Data science is never a sequential process. Take a design thinking approach to identify the right business problems to solve. Adopt an agile methodology to design the right analytical approach to those challenges. Finally, build an iterative process to get the data you need incrementally.

In most scenarios, it is impossible to anticipate every potential data source you will need. Evaluating advanced analysis techniques when you are just starting your data analytics journey is a waste of resources; it leads to overengineering and analysis paralysis. Remember that running data analytics projects first can help you build a solid roadmap for data engineering.

This iterative process is more efficient and effective, and it will add transformational value. Build momentum by delivering quick wins with bite-sized data analysis solutions. Focus on user adoption to help convert insights into business decisions and ultimately drive return on investment.
