I have had a lot of conversations with customers to help them understand how to design a data lake. I touched on this in my blog Data lake details, but that was written a long time ago so I wanted to update it. I often find customers do not spend enough time in designing a data lake and many times have to go back and redo their design and data lake build-out because they did not think through all their use cases for data. So make sure you think through all the sources of data you will use now and in the future, understanding the size, type, and speed of the data. Then absorb all the information you can find on data lake architecture and choose the appropriate design for your situation.
The concepts are simple but there are some interesting implications to what James includes as well as additional resources, so check it out.