Big Data, Big Rubbish?
“Answers and Tips to prevent your Big Data Deployments from becoming Big Rubbish”
Big Data has been trending on social media for years now. The concept is not a new one, and it covers a set of technologies with different applications across different industries. Drawing on diverse data sets that are available internally or externally, Big Data is characterised by the variety, volume and velocity of the data produced, and therefore cannot easily be processed or analysed by more conventional methods such as traditional databases.
However, there is still a lot of confusion around the mix of technologies required. The productised solution platforms currently available in the market are still at a very early stage, and the total cost of ownership for gathering, storing, maintaining and analysing Big Data remains very high.
So, let’s agree on one thing: for now, Big Data remains more of a hype and a buzzword than a feasible solution option, especially for organisations that are not large enough to invest big in Big Data.
On the other hand, Big Data investments should prove beneficial in the near future, especially when combined with analytics capabilities and a clear vision for the future. This article provides answers to key questions and tips for creating a clear strategy to deliver Big Insights and prevent Big Data investments from becoming Big Rubbish.
What should a Big Data solution be?
As with most problems in today’s complicated world, there is no single solution that fits all. You will need to identify the optimal mix of technologies and create a scalable, flexible environment that serves your current needs and future plans and aligns with your business objectives and strategic priorities.
But as Big Data continues to grow and become even Bigger, Advanced Analytics needs to be the core component of any Big Data solution in order to operationalise and optimise data-related processes effectively and derive actionable insights and business value.
Big Data without Advanced Analytics capabilities will simply result in data overload, noise and, unavoidably, Big Rubbish. It is not the increasing volume of Big Data, nor the increasing computational power or storage capacity, that will answer your business problems, but the analytical algorithms that allow you to mine Big Data sets and discover hidden patterns and associations.
What type of Big Data do I need?
Again, there is no single golden rule to follow, as the answer depends on the industry, the activities and the environment in which each organisation operates, and most importantly on the business problem that needs to be addressed.
Big Data sources vary in terms of structure, origin and nature. For example, for a financial institution that aims to improve customer service, Big Data can originate from structured CRM/Marketing, Sales and Transactional data sets, unstructured Social Media text and external socio-economic data from surveys. For a manufacturing company that wants to reduce production downtime, Big Data can originate from Asset Management Systems, Supervisory Control and Data Acquisition (SCADA) systems, Inspections and Production yields, as well as Weather Data and Internet of Things (IoT) sensor readings from machine equipment.
How Big should my Big Data be?
Analysing Big Data sources can result in Big Insights, but it can also result in Big Noise and Big Rubbish. A balance should be maintained between Big Data sources and targeted, representative samples (Small Data), and also between traditional modelling techniques and artificial intelligence, deep learning and machine learning algorithms. Variable selection and sampling are still important for an accurate statistical model, and access to Big Data is the key ingredient for both. Big Data gives data scientists the breadth they need to build differentiated analytical models that address different business problems and reveal associations and correlations, while Small Data, or appropriately chosen relevant data samples, will help you to uncover root causes and produce accurate predictions and targeted actions for future events.
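To make the idea of a targeted, representative sample more concrete, here is a minimal sketch of stratified sampling in Python. It is only an illustration under stated assumptions: the function name stratified_sample, the customer records and the "segment" field are all hypothetical, and a real project would choose its strata and sampling fraction based on the business problem at hand.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Draw roughly `fraction` of the records from each group defined by `key`.

    Hypothetical helper for illustration; not taken from any specific library.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible

    # Group the records by the value of the chosen field (the strata).
    strata = defaultdict(list)
    for record in records:
        strata[record[key]].append(record)

    sample = []
    for group in strata.values():
        # Keep at least one record per group so small segments are not lost.
        size = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, size))
    return sample


# Made-up data: 900 retail customers and 100 corporate customers.
customers = (
    [{"id": i, "segment": "retail"} for i in range(900)]
    + [{"id": i, "segment": "corporate"} for i in range(900, 1000)]
)

# A 10% sample that preserves the retail/corporate proportions.
small_data = stratified_sample(customers, key="segment", fraction=0.1)
```

Sampling within each segment, rather than from the whole data set at once, keeps the smaller corporate segment represented in the Small Data in the same proportion as in the full data set.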
What other things should I consider?
Even though Big Data deployments are technology-intensive, people’s skills are, and will remain, important for deriving business value from data processes. They will prevent your Big Data efforts from becoming Big Rubbish and will increase return on investment. Experienced data scientists with appropriate skills and training, combined with a project governance system, will add clarity to your efforts and create a competitive advantage for the future.