Introduction
The goal of this blog is to introduce Valorem readers to data science and, more specifically, to machine learning and predictive analytics. Valorem clients are clamoring for help understanding what machine learning might do for their companies. In all the excitement, customers sometimes overlook the basics. When developing predictive algorithms, playing the hare and skipping the basics reduces the value of any model you build. Speed is your enemy. A thoughtful approach lets the tortoise win the race and deliver a robust, high-performing predictive solution. Developing the right dataset to use as input to predictive modeling is the key ingredient of successful algorithm development. It can be a slow and tedious process, but it is a critical step, arguably the most important step in the machine learning process.
Why the Tidal Wave of Interest Now?
Several factors are driving the interest in predictive analytics. This is a unique time in history: a confluence of loosely related changes is fueling the corporate passion (or should I say necessity) for machine learning and predictive analytics.
 
Data is now the key strategic business asset.
Everything that’s happening in the world around us (every purchase, upload, shipment, tweet, keystroke, sensor reading and customer interaction) is producing rich data that can help us create new experiences, new efficiencies, new business models and even new inventions. The International Data Corporation (IDC) projects that the digital universe will double every two years between now and 2020, reaching 40 zettabytes (ZB) by 2020. IDC also estimates there will be approximately 5,247 GB of data for every man, woman and child on earth in 2020. Leveraging this data can be the differentiator for your business: IDC estimates that companies that lead in using their data assets will capture $1.6 trillion more in business value. Gartner predicts that by 2020, 10% of organizations will have a highly profitable business unit specifically for productizing and commercializing their data.
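To get a feel for what "doubling every two years" means, here is a minimal Python sketch of the compounding arithmetic. The function name growth_factor and its doubling_period parameter are illustrative assumptions, not part of any IDC methodology; the sketch only shows how quickly a steady doubling adds up.

```python
def growth_factor(years, doubling_period=2.0):
    """Return how many times larger a quantity becomes after `years`
    if it doubles once every `doubling_period` years (hypothetical helper)."""
    return 2 ** (years / doubling_period)


# Print the multiplier at two-year intervals over one decade.
for years in (2, 4, 6, 8, 10):
    print(f"after {years:>2} years: {growth_factor(years):.0f}x the starting volume")
```

In other words, a decade of doubling every two years leaves you with roughly 32 times as much data as you started with, which is why waiting to build a data strategy gets more expensive every year.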
We are living in the era of the 4th Industrial Revolution.
At certain points in history, advancements in technology fundamentally disrupt the economic landscape. Some of history’s biggest revolutions occurred as a direct result of innovative technological developments. In the face of these disruptions, adapting is the key to survival and growth. Those who adapt successfully often emerge stronger than before. Those who don’t risk becoming obsolete. 
  • In the 1700s, mechanization, the steam engine, the weaving loom and iron production introduced factory systems that replaced manual production.
  • In the 1800s, electrical energy, telegraph lines, mass production and the assembly line rapidly accelerated factory production systems. Output multiplied many times.
  • A century later came the digital revolution, driven by the rise of digital computers and the internet. As these technologies became widespread, they enabled information to be generated and shared faster and more easily, opening up new possibilities for economic, social, and technological innovation.
  • Today marks the beginning of the 4th Industrial Revolution. We are still in the early days of leveraging data to learn and to automate processes in ways that significantly accelerate decision making. The rise of predictive and prescriptive analytics, IoT and AI has the potential to disrupt every decision in every business.
Exponential technology change. 
Technological change is happening more rapidly, and the window of time that companies have to adapt is shrinking. The principle of "adapt or get left behind" should drive business strategy. As Ray Kurzweil put it, we won’t experience 100 years of progress in the 21st century; it will be more like 20,000 years of progress.
  • A hundred years ago, the average lifespan of a company listed on the S&P 500 was 67 years. Today, that average has decreased to just 15 years. It is becoming harder for companies to stay in the lead, or even in business, for very long.
  • Richard Foster, a Yale University professor and former McKinsey partner, predicts that in the next decade, only 25% of the companies currently listed on the S&P 500 will remain there, meaning the other 75% will be replaced by new companies. These new companies will be those that take advantage of innovative technological capabilities to rapidly gain ground in their chosen markets. Similarly, the existing companies that remain on the index will be those that rapidly innovate and evolve their businesses.
  • In their book The Second Machine Age, MIT economists Erik Brynjolfsson and Andrew McAfee propose that big data, computation, and innovation are changing our economy and institutions on a scale greater than almost anything seen before in history.
It’s the Data!

We’ve got data everywhere, and it is being created faster than ever. We need to use it to keep up with the accelerating rate of change, or we risk becoming irrelevant. Businesses stand to gain tremendous value from this trove of digital material, but doing so can be overwhelming. These datasets are vast, varied, and ever-changing. To derive value from these assets, businesses need advanced tools capable of quickly parsing out relevant information and making connections across disparate data sources, both inside and outside the enterprise. How do we do this? There is only one answer: machine learning and AI. For both of these solutions, data is key: data is the new currency.

The most critical step you can take in developing data insights is first to learn why data is your most important asset and how to use it as inputs in the development of algorithms. Next, understand why your data right now is NOT suitable for algorithm development and what steps you must perform to prepare it. Valorem's Digital Insights Workshops can help you do just that!