Like their bigger brethren, mid-sized companies are struggling to manage tens of terabytes of data about their customers, markets and products: a veritable gold mine of information, if only they knew how to excavate it.
In the last two years alone, businesses have generated more data than we saw in the previous 60 years. Thanks to innovations in deduplication, compression, incremental increases in hard drive density and falling solid state drive (SSD) prices, companies are finding ways to store the massive influx of data. The real challenge, however, goes beyond storage. This data is rich with intelligence that could inform business strategy, reduce costs and drive growth, but few smaller companies have the budgets or the staff to unlock its potential. These businesses need a solution that can provide answers and intelligence without breaking budgets or requiring a data scientist.
The volume of data is daunting enough; the type of data, however, magnifies the challenge. Structured data accounts for only about 20 percent of stored information. The rest, unstructured data, includes social media feeds, emails, blogs, Microsoft Office documents, photos, videos and more.
This data typically lives in a variety of locations across an organization and is rarely managed explicitly. The companies that do attempt to manage these unstructured sources generally use document management systems, which often end up as yet another information silo, like email applications, file shares and corporate intranets. In a study conducted last year, IDC found that these silos are part of the reason why information workers lose nearly 20 percent of their time to inefficiencies.
It's easy to see why unstructured data lies dormant. It is often created and consumed ad hoc and isn't organized for ease of access. It doesn't have a clearly defined schema, and companies typically lack the tools and expertise to massage, visualize and manipulate this data to identify valuable information and inform decisions.
This challenging data is at the core of the big data problem with which companies of all sizes are grappling. At the large enterprise end of the spectrum, the solution comes in the form of a cluster of servers and one or more data scientists (and the large costs associated with them). Some smaller companies, in their scramble to keep up, have tried to train their business analysts to do data scientist-level work. What these smaller companies really need is a solution that can automatically transform their data into intelligence and present the right data to the right people at the right time.
Data intelligence questions
The first priority in solving the unstructured, big data problem for smaller companies is putting the data in context. For example, who in my company knows about Customer X? Where is the latest version of the contract, and who should have read it but hasn't? These companies need to know what data is being produced by which departments and individuals, and they need to know how their teams are using that information. Answering these questions raises numerous other ones, including: