Actionable Data Analytics

Create an Azure Data Lake

Do you want to create an Azure Data Lake? In this blog post, I’ll share the steps to create one using the Azure portal. Unlike many other Azure services, a Data Lake is not a separate resource type: it is a specific configuration of an Azure Storage Account, and you must enable the hierarchical namespace option to get it.

Create an Azure Data Lake 

First, find the “Storage accounts” service in your Azure Portal. Azure Data Lake is a configuration of Azure Storage Accounts. 


There is also a shortcut on the left navigation pane. 


Then, select “Create.” 

Now you can start configuring the main options to create your Azure Data Lake.

Basic Configuration 

1. Project details 

Select the subscription and the resource group where you’d like to create the Azure Data Lake. 

2. Storage account details 

  • Storage account name – must be between 3 and 24 characters, contain only lowercase letters and numbers, and be globally unique across Azure. My naming convention: dls (data lake storage) + descriptive short name + 3-letter region + 3-letter environment 
  • Region – select the region where you want to host your service. Microsoft will continue to add more regions. 
  • Performance and Redundancy – see image below 
Create a storage account
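
As a rough sketch, the naming convention and the name rules above could be checked with a few lines of Python before you type the name into the portal. The function name and the example values ("sales", "aue", "dev") are made up for illustration; only the 3 to 24 character, lowercase letters and numbers rule comes from the portal.

```python
import re

# Rule quoted above: 3-24 characters, lowercase letters and digits only.
_ACCOUNT_NAME_RE = re.compile(r"[a-z0-9]{3,24}")


def build_account_name(short_name: str, region: str, environment: str) -> str:
    """Compose dls + short name + 3-letter region + 3-letter environment."""
    for label, code in (("region", region), ("environment", environment)):
        if len(code) != 3:
            raise ValueError(f"{label} code must be exactly 3 characters: {code!r}")
    name = f"dls{short_name}{region}{environment}".lower()
    if not _ACCOUNT_NAME_RE.fullmatch(name):
        raise ValueError(f"invalid storage account name: {name!r}")
    return name


if __name__ == "__main__":
    # Hypothetical example values, not taken from the original post.
    print(build_account_name("sales", "aue", "dev"))
```

This only validates the format. Whether the name is still available is something Azure checks when you submit the form.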

Advanced 

  1. For security reasons, I suggest always disabling Blob Public Access and Account Key Access. You can enable these options after creating the service if required. 
  2. Important: enable Hierarchical Namespaces. This option needs to be enabled to create an Azure Data Lake. 
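
If you later script the deployment instead of clicking through the portal, the Advanced choices above map to properties of the storage account resource. The sketch below only builds the request body as a Python dictionary; it does not call Azure. The function name, the "australiaeast" location, and the Standard_LRS SKU are assumptions for illustration, so pick the values that match your own Performance and Redundancy choices.

```python
import json


def data_lake_account_body(location: str) -> dict:
    """Request-body sketch for a storage account with Data Lake settings."""
    return {
        "location": location,
        "kind": "StorageV2",
        "sku": {"name": "Standard_LRS"},  # assumption: choose your own redundancy
        "properties": {
            "isHnsEnabled": True,            # hierarchical namespace, i.e. Data Lake
            "allowBlobPublicAccess": False,  # Blob Public Access disabled
            "allowSharedKeyAccess": False,   # Account Key Access disabled
        },
    }


if __name__ == "__main__":
    # "australiaeast" is a hypothetical example region.
    print(json.dumps(data_lake_account_body("australiaeast"), indent=2))
```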

Networking 

Next, I highly recommend using private endpoints where possible when creating your Azure Data Lake. Private endpoints require additional configuration steps to enable connectivity with other services. 

Networking
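
To give an idea of the extra configuration involved, here is a minimal sketch of the request bodies for private endpoints targeting a Data Lake account. It assumes you want one endpoint for the "dfs" sub-resource and one for "blob", which is a common setup for Data Lake accounts. The function name and the resource IDs are placeholders, and the code only builds dictionaries without calling Azure.

```python
import json


def private_endpoint_bodies(location, storage_account_id, subnet_id,
                            group_ids=("dfs", "blob")):
    """Return one private endpoint request body per storage sub-resource."""
    bodies = {}
    for group_id in group_ids:
        bodies[group_id] = {
            "location": location,
            "properties": {
                "subnet": {"id": subnet_id},
                "privateLinkServiceConnections": [
                    {
                        "name": f"{group_id}-connection",
                        "properties": {
                            "privateLinkServiceId": storage_account_id,
                            "groupIds": [group_id],
                        },
                    }
                ],
            },
        }
    return bodies


if __name__ == "__main__":
    # Placeholder resource IDs; replace them with your own.
    account_id = "<storage-account-resource-id>"
    subnet_id = "<subnet-resource-id>"
    print(json.dumps(private_endpoint_bodies("australiaeast", account_id, subnet_id), indent=2))
```

Each endpoint also needs DNS resolution for its private IP address, which is part of the additional configuration mentioned above and is not covered by this sketch.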

Data Protection 

Modify any data recovery options as required. Unlike normal storage accounts, the tracking capabilities (such as blob versioning and the change feed) are not available once hierarchical namespace is enabled. 

Data protection in Azure

Encryption 

If you want to use your own keys to encrypt the information at rest, you can configure them under this section: 

Encryption

Tags 

Make sure you create/include some tags to facilitate administration before you create your Azure Data Lake. 

Tags in Storage account

Final Steps 

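This is the last screen before you create your Azure Data Lake! Review the settings and, once the portal shows “Validation passed”, select Create. The portal reports “Deployment is complete” when the storage account is ready.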

Access Your Azure Data Lake 

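Once your Azure Data Lake is created, you can navigate folders using the Azure portal or client tools like Azure Storage Explorer. In a Data Lake, access is managed using Azure role-based access control (RBAC), which can be combined with POSIX-style access control lists (ACLs) on individual directories and files. 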

Grant Access to your Azure Data Lake 

You can grant access by using RBAC: open Access control (IAM) on the storage account and add a role assignment. 

Access your Azure Data lake

In this case, select the Storage Blob Data Contributor role, which grants read, write, and delete access to containers and their data. 

Add role assignment

Select members as described below: 

Add role assignment

Finally, review and assign. 

review and assign roles
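
For reference, the same role assignment can be described as data. The sketch below builds the name and body of a role assignment request for Storage Blob Data Contributor at the scope of a storage account. It does not call Azure. The function name is made up, the scope and principal ID are placeholders, and the role GUID is the built-in role ID as I understand it, so please confirm it against the Azure built-in roles list before relying on it.

```python
import uuid

# Built-in role ID for "Storage Blob Data Contributor" (verify before use).
STORAGE_BLOB_DATA_CONTRIBUTOR = "ba92f5b4-2d11-453d-a403-e96b0029c9fe"


def role_assignment_request(scope: str, principal_id: str, principal_type: str = "User"):
    """Return (assignment_name, body) for a role assignment at the given scope."""
    parts = scope.split("/")
    if len(parts) < 3 or parts[1].lower() != "subscriptions":
        raise ValueError("scope must start with /subscriptions/<subscription-id>")
    subscription_id = parts[2]
    role_definition_id = (
        f"/subscriptions/{subscription_id}/providers/"
        f"Microsoft.Authorization/roleDefinitions/{STORAGE_BLOB_DATA_CONTRIBUTOR}"
    )
    body = {
        "properties": {
            "roleDefinitionId": role_definition_id,
            "principalId": principal_id,      # object ID of the user, group, or identity
            "principalType": principal_type,
        }
    }
    # Role assignment names are GUIDs; a random one is generated here.
    return str(uuid.uuid4()), body


if __name__ == "__main__":
    # Placeholder scope and principal; replace them with your own values.
    scope = ("/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-example"
             "/providers/Microsoft.Storage/storageAccounts/dlsexampleauedev")
    print(role_assignment_request(scope, "11111111-1111-1111-1111-111111111111"))
```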

Create Containers and Folders 

Now, you can create containers and folders. 

Create a container as described below: 

create containers and folders

Create a folder inside the container to test permissions. 

add directory
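
The container and folder can also be created from code. Below is a minimal sketch that assumes the azure-identity and azure-storage-file-datalake packages are installed, that your identity already holds the Storage Blob Data Contributor role, and that the account lives in the public Azure cloud (hence the dfs.core.windows.net suffix). The function names and the account, container, and folder names are hypothetical.

```python
def dfs_account_url(account_name: str) -> str:
    """Data Lake (dfs) endpoint of a storage account in the public Azure cloud."""
    return f"https://{account_name}.dfs.core.windows.net"


def create_container_and_folder(account_name: str, container: str, folder: str) -> None:
    """Create a container (file system) and a directory inside it."""
    # Imported here so the module still loads where the SDK is not installed.
    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    service = DataLakeServiceClient(
        dfs_account_url(account_name), credential=DefaultAzureCredential()
    )
    file_system = service.create_file_system(file_system=container)
    file_system.create_directory(folder)


if __name__ == "__main__":
    # Hypothetical account name; this only prints the endpoint, it does not call Azure.
    print(dfs_account_url("dlsexampleauedev"))
    # create_container_and_folder("dlsexampleauedev", "raw", "test-permissions")
```

If the call fails with an authorization error, the role assignment may not have propagated yet, or it may be missing for the identity that DefaultAzureCredential picked up.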

Summary 

To summarize, creating an Azure Data Lake is an option within Azure Storage Accounts. There are many options related to security and networking that must be considered, but it’s really easy to get started. 
