With Azure Synapse Link for Azure Cosmos DB you can get near real-time insights on operational data without affecting your operational data store. By using a native Azure Cosmos DB analytical data store, you can have a fully isolated column store.
Today we are going to create a link to Azure Cosmos DB SQL API that will allow you to query the database, without needing to build an additional ETL/ELT process or writing code!
Table of Contents
I have created the following sections to finalise the configuration:
- Prepare Azure Cosmos Database SQL API
- Create Azure Synapse Link for Azure Cosmos DB SQL API
- Query Data
Prepare Azure Cosmos Database SQL API
Before we create the linked service in Azure Synapse Analytics, we need to enable the feature in Azure Cosmos DB.
The next step is to create or modify a container with the analytical store:
Modify the container to use the analytical store in the Settings section:
If you are building a new container (as was my case), see the attached additional steps below so you can add some items (you can copy them from this link or the description below):
Insert documents (one by one):
{
"id": "1",
"customerid": 101,
"firstname": "Orlando",
"lastname": "Gee",
"email": "[email protected]",
"addresses": [
{ "addressline1" : "660 Lindbergh", "city" : "Saint Louis", "stateprovince" : "Missouri", "countryregion" : "United States", "postalcode" : "63103" },
{ "addressline1" : "Ontario Mills", "city" : "Ontario", "stateprovince" : "California", "countryregion" : "United States", "postalcode" : "91764" }
]
}
{
"id": "2",
"customerid" : 122,
"title" : "Ms.",
"firstname" : "Caroline",
"middlename" : "A.",
"lastname" : "Vicknair",
"companyname" : "World of Bikes",
"salesperson" : "adventure-works\\jillian0",
"addresses": [
{ "addressline1" : "72540 Blanco Rd.", "addressline2" : "South Side", "city" : "San Antonio", "stateprovince" : "Texas", "countryregion" : "United States", "postalcode" : "78204" }
]
}
{
"id": "3",
"customerid": 108,
"title" : "Mr.",
"firstname" : "Robert",
"middlename" : "R.",
"lastname" : "Vessa",
"suffix" : "Esq.",
"companyname" : "Totes & Baskets Company",
"salesperson" : "adventure-works\\jillian0",
"emailaddress" : "[email protected]",
"phone" : "560-555-0171"
}
You can visualize the items in Azure Cosmos DB before moving on to the final configuration in Azure Synapse Analytics.
Now, we are ready to finalize the configuration in Azure Synapse Analytics.
Create Link for Azure Cosmos DB in Azure Synapse Analytics
In Azure Synapse Analytics, we can create a link by using the following hubs:
- Azure Synapse Analytics Data Hub (click to know more about the Data Hub)
- Azure Synapse Analytics Manage Hub (click to know more about the Manage Hub)
Data Hub
In the Data Hub section, we can link Azure Cosmos DB SQL API and external data. Make sure you select Azure Cosmos DB (SQL API) for this tutorial.
Manage Hub
In the Manage Hub section, you can create a linked service to your Azure Cosmos DB SQL API. Select Azure Cosmos DB (SQL API) for this tutorial.
Both options (Data Hub and Manage Hub) will display the following screen to create the link:
Query data
Now you are ready to start consuming data from Azure Cosmos DB by using notebooks.
Azure Synapse Analytics will pre-populate all the information for you in the notebook:
Finally, you are ready to start executing notebooks and visualize information using your ApacheSpark cluster!
Summary
I am amazed by Azure Synapse Link for Azure cosmos DB. There is no need to build an ETL process to query the operational data. It only takes around 5 seconds for data to be replicated in the Azure Cosmos DB analytical datastore which is impressive.
The templates in Azure Synapse Analytics even allow you to query the information without writing a single line of code!
What’s next?
Come back soon for more posts on Azure Synapse Analytics’ exciting features!