In continuation to my series about what Big Data can do or can't do for you, here's another thought of mine.
First of all I don't want to say that Big Data is not worth it. All this post is about is to make you ask a question to yourself, everyone knows there are risks with big data, just ask yourself are you willing to take the risk, IS IT WORTH IT?
This question is for both companies starting with big data or yet in experimentation phase, and for the people starting with big data ( I have already written http://bigdatabuff.blogspot.in/2015/10/reasons-why-moving-to-big-data-can-be.html for you)
The reason I am writing this post is because I have seen companies starting with big data, experimenting with it for sometime and then just drop it because some mistake convince them that its not worth it. I hope to give insights to what you can or can't expect big data to do for you.
Lets chat then.
The common mistakes or problems are:
- Not giving enough time: Time is a huge problem for companies now a days. Companies start on big data, build R&D teams of experts, pay them heavily and expect the outcome in a month. R&D is not just development, it has research in it for a reason. It takes time, it can fail at times; more times than succeeding. Don't expect short term profits from it. If you can't wait and don't want to "waste" your manpower on it just outsource it, because building something innovative takes time.
- Misconception: At times people are under the impression that big data is some sort of magic that makes system so fast that you can do everything in milliseconds (Yep people still believe in magic :P). Its big data but its still computer science, all the logics and limitations of computers still apply to it people. Its logic not magic (surprize!!!!). There is a reason that everytime anyone talks about the making a process fast, they also give hardware specifications. Ever heard about terasort benchmark? http://sortbenchmark.org/
- Too much experimenting: Well its opposite of your problem1. Just because you are doing R&D doesn't mean you have all the time in the world people!!! Sometimes the problem statements or assumptions you create in your head are wrong. Just because your hypothetical data didn't run as fast as you thought with big data tools, doesn't mean your actual data won't. The millisecond optimizations are to be done with real data. If you go on experimenting with every big data tool on earth with just assumptions (not their architecture) then I am sorry my friend, the list is too long, it'll take a life time to complete.
- Too Haphazard architecture: Its a different variant of problem 1. At times people actually don't have a choice other than to use big data. Then at times they make the mistake of taking the first thing they get their hands on and create a system in a haphazard manner. You don't need to look at every big data tool, but atleast look at enough to make sure you tried most of the feasible alternatives.
Big Disclaimer: I am not here to criticize anyone or to say that nobody is using big data the right way. I just wrote this post to point out common mistakes made by companies starting with big data, so that they can be avoided ( I hope I am helping someone :) )
If you can think of some other way someone or even you messed up something like this (we all make mistakes at times), please share in the comments. I would be glad to hear them.
Also if you don't agree with something in my post or if I seem rude at times (although I don't mean to), do tell me.
Thanks for bearing with me. Stay tuned for more :)