What Technologies Are Available to Build a Data Lake?

Big data technologies can cover a wide range of enterprise analytics needs, but transactional data poses specific challenges. For example, a bank payment must update both the account balance and the payment record immediately and consistently. SQL engines for big data may eventually cover many of these needs, but big data platforms will likely still require some custom solutions, and this is where data preparation tools come into play. Understanding these challenges makes it easier to decide which technologies to use. Here are some to consider.
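To make the bank payment example concrete, here is a minimal sketch of the transactional behavior involved, using Python's built-in `sqlite3` module. The table names (`accounts`, `payments`), the function `record_payment`, and the sample amounts are assumptions made up for illustration; a real payment system would be considerably more involved.

```python
import sqlite3

def record_payment(conn, account_id, amount):
    """Debit an account and log the payment as one atomic transaction."""
    # "with conn" commits if the block succeeds and rolls back if it raises,
    # so the balance and the payment record can never disagree.
    with conn:
        cur = conn.execute(
            "UPDATE accounts SET balance = balance - ? "
            "WHERE id = ? AND balance >= ?",
            (amount, account_id, amount),
        )
        if cur.rowcount != 1:
            raise ValueError("unknown account or insufficient funds")
        conn.execute(
            "INSERT INTO payments (account_id, amount) VALUES (?, ?)",
            (account_id, amount),
        )

# Hypothetical schema and sample data, kept in memory for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("CREATE TABLE payments (account_id INTEGER NOT NULL, amount INTEGER NOT NULL)")
conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")
conn.commit()

record_payment(conn, 1, 30)
balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
payment_count = conn.execute("SELECT COUNT(*) FROM payments").fetchone()[0]
```

Batch-oriented big data tools are generally not designed around this kind of all-or-nothing update, which is part of why transactional workloads are harder to move onto them.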

Several cloud platforms support data lakes. Amazon Web Services (AWS) is a popular option. Its data lake solution includes AWS Lambda microservices, Amazon Elasticsearch Service (since renamed Amazon OpenSearch Service), Amazon Cognito user authentication, and Amazon Athena analytics. Microsoft Azure’s data lake offering includes Hadoop and related services. It is built on Microsoft’s cloud platform but is less of a one-stop shop. However, the company boasts that Twitter and Facebook use its platform.

A data lake complements a data warehouse, letting enterprises store older data for historical analysis and for staging. Early data lakes were built on the Hadoop Distributed File System (HDFS). Hadoop is an open-source data processing framework that uses MapReduce to split a computation into smaller tasks that run in parallel on commodity hardware. With cloud data management services now widely available, businesses can be more confident that future projects will have access to their data lake.
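The MapReduce idea mentioned above can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle, and reduce steps using a word count, not Hadoop's actual API; the function names and the sample text are assumptions chosen for the example.

```python
from collections import defaultdict
from itertools import chain

def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: combine each key's values into a single result."""
    return {key: sum(values) for key, values in grouped.items()}

def word_count(chunks):
    # Each chunk is mapped independently; on a cluster, these map tasks
    # would run on different machines in parallel.
    mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
    return reduce_phase(shuffle(mapped))

# Hypothetical input, already split into two chunks.
chunks = [
    "data lake stores raw data",
    "a data warehouse stores curated data",
]
counts = word_count(chunks)
```

Because each chunk is mapped without reference to the others, the work can be spread across many inexpensive machines, which is the property that made this model a fit for commodity hardware.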
