Data Lakes are an effective way to store all your raw data in one place and then clean, analyze, and share that data within your organization. Hadoop is an open source technology that offers many ways to achieve a given outcome, which often leaves IT developers and business analysts confused. Even Hadoop vendors (Cloudera, Hortonworks) suggest different approaches to the same problem.
At OvalEdge, we have defined a set of procedures and built our product around them. This makes Data Lake implementation easy and straightforward. At a high level, you want to do the following with your Data Lake: