How To Download a File From Azure Blob Storage Using Python


Annegret Mclean

Jan 20, 2024, 10:33:56 PM

The Azure Storage Blobs client library for Python allows you to interact with three types of resources: the storage account itself, blob storage containers, and blobs. Interaction with these resources starts with an instance of a client. To create a client object, you will need the storage account's blob service account URL and a credential that allows you to access the storage account:

To use an Azure Active Directory (AAD) token credential, provide an instance of the desired credential type obtained from the azure-identity library. For example, DefaultAzureCredential can be used to authenticate the client.
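
A minimal sketch (the account URL below is a placeholder; substitute your own storage account name):

    from azure.identity import DefaultAzureCredential
    from azure.storage.blob import BlobServiceClient

    # Placeholder account URL; substitute your own storage account name.
    account_url = "https://mystorageaccount.blob.core.windows.net"

    # DefaultAzureCredential tries environment variables, managed identity,
    # Azure CLI login, and so on, so the same code works locally and in Azure.
    credential = DefaultAzureCredential()

    blob_service_client = BlobServiceClient(account_url, credential=credential)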

To use a shared access signature (SAS) token, provide the token as a string. If your account URL includes the SAS token, omit the credential parameter. You can generate a SAS token from the Azure Portal under "Shared access signature" or use one of the generate_sas() functions to create a SAS token for the storage account, container, or blob:
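
For instance, a sketch using generate_account_sas (the account name and key are placeholders):

    from datetime import datetime, timedelta, timezone
    from azure.storage.blob import (
        BlobServiceClient,
        generate_account_sas,
        ResourceTypes,
        AccountSasPermissions,
    )

    # Placeholder account details; substitute your own.
    account_name = "mystorageaccount"
    account_key = "<your-account-key>"

    # Generate an account-level SAS token valid for one hour with read/list permission.
    sas_token = generate_account_sas(
        account_name=account_name,
        account_key=account_key,
        resource_types=ResourceTypes(container=True, object=True),
        permission=AccountSasPermissions(read=True, list=True),
        expiry=datetime.now(timezone.utc) + timedelta(hours=1),
    )

    # Pass the SAS token as the credential (or append it to the account URL).
    client = BlobServiceClient(
        f"https://{account_name}.blob.core.windows.net", credential=sas_token
    )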

Depending on your use case and authorization method, you may prefer to initialize a client instance with a storage connection string instead of providing the account URL and credential separately. To do this, pass the storage connection string to the client's from_connection_string class method:
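
For example (the connection string is a placeholder; in practice read it from an environment variable or a secret store rather than hard-coding it):

    from azure.storage.blob import BlobServiceClient

    # Placeholder connection string; substitute your own.
    conn_str = "DefaultEndpointsProtocol=https;AccountName=<name>;AccountKey=<key>;EndpointSuffix=core.windows.net"

    blob_service_client = BlobServiceClient.from_connection_string(conn_str)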

Can someone tell me if it is possible to read a CSV file directly from Azure Blob Storage as a stream and process it using Python? I know it can be done using C#/.NET, but I wanted to know the equivalent library in Python to do this.
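
One way to do this in Python is a sketch along these lines (connection string, container, and blob names are placeholders):

    import csv
    import io
    from azure.storage.blob import BlobClient

    # Placeholder names; substitute your own connection string, container, and blob.
    blob_client = BlobClient.from_connection_string(
        conn_str="<your-connection-string>",
        container_name="mycontainer",
        blob_name="data.csv",
    )

    # download_blob() returns a StorageStreamDownloader; readall() pulls the bytes.
    stream = io.StringIO(blob_client.download_blob().readall().decode("utf-8"))

    for row in csv.reader(stream):
        print(row)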

I struggled a lot with this and don't want anyone else to go through the same. If you are using openpyxl and want to write directly from an Azure Function to Blob Storage, follow these steps and you will achieve what you are looking for.
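
A minimal sketch of that approach (the connection string and names are placeholders, and the workbook content is just an example): save the workbook to an in-memory buffer instead of a local file, then upload the buffer.

    import io
    from openpyxl import Workbook
    from azure.storage.blob import BlobServiceClient

    # Build a workbook in memory.
    wb = Workbook()
    wb.active["A1"] = "hello"

    # Save it to an in-memory buffer instead of the local filesystem.
    buffer = io.BytesIO()
    wb.save(buffer)
    buffer.seek(0)

    # Placeholder connection string and names; substitute your own.
    service = BlobServiceClient.from_connection_string("<your-connection-string>")
    blob = service.get_blob_client(container="reports", blob="report.xlsx")
    blob.upload_blob(buffer, overwrite=True)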

I have thousands of images sitting in a container in my blob storage. I want to process these images one by one in Python and write the new images out to a new container (the process is basically detecting and redacting objects). Downloading the images locally is not an option because they take up way too much space.
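
One way to avoid touching the local disk is to stream each blob through memory; a sketch, where redact_objects is a hypothetical stand-in for your detection and redaction logic and the names are placeholders:

    from azure.storage.blob import BlobServiceClient

    def redact_objects(image_bytes: bytes) -> bytes:
        # Hypothetical placeholder for your detection/redaction logic.
        return image_bytes

    service = BlobServiceClient.from_connection_string("<your-connection-string>")
    source = service.get_container_client("images")
    target = service.get_container_client("images-redacted")

    # Stream each blob into memory, process it, and upload the result,
    # so nothing is ever written to the local disk.
    for blob in source.list_blobs():
        data = source.download_blob(blob.name).readall()
        target.upload_blob(name=blob.name, data=redact_objects(data), overwrite=True)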

From the project directory, install packages for the Azure Blob Storage and Azure Identity client libraries using the pip install command. The azure-identity package is needed for passwordless connections to Azure services.
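
For example:

    pip install azure-storage-blob azure-identity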

You can also authorize requests to Azure Blob Storage by using the account access key. However, this approach should be used with caution. Developers must be diligent to never expose the access key in an unsecure location. Anyone who has the access key is able to authorize requests against the storage account, and effectively has access to all the data. DefaultAzureCredential offers improved management and security benefits over the account key to allow passwordless authentication. Both options are demonstrated in the following example.
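
The original example is not reproduced here, but a minimal sketch of the two options (with a placeholder account URL and key) looks like this:

    from azure.identity import DefaultAzureCredential
    from azure.storage.blob import BlobServiceClient

    account_url = "https://mystorageaccount.blob.core.windows.net"

    # Option 1 (recommended): passwordless authentication.
    client_aad = BlobServiceClient(account_url, credential=DefaultAzureCredential())

    # Option 2: account access key (keep the key out of source control).
    client_key = BlobServiceClient(account_url, credential="<your-account-key>")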

When developing locally, make sure that the user account that is accessing blob data has the correct permissions. You'll need Storage Blob Data Contributor to read and write blob data. To assign yourself this role, you'll need to be assigned the User Access Administrator role, or another role that includes the Microsoft.Authorization/roleAssignments/write action. You can assign Azure RBAC roles to a user using the Azure portal, Azure CLI, or Azure PowerShell. You can learn more about the available scopes for role assignments on the scope overview page.

To assign a role at the resource level using the Azure CLI, you first must retrieve the resource id using the az storage account show command. You can filter the output properties using the --query parameter.
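
For example (the account, resource group, and assignee are placeholders):

    # Get the resource ID of the storage account.
    az storage account show --name mystorageaccount --resource-group myresourcegroup --query id --output tsv

    # Assign the Storage Blob Data Contributor role at that scope.
    az role assignment create --assignee "<your-user-or-app-id>" --role "Storage Blob Data Contributor" --scope "<resource-id-from-above>"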

When deployed to Azure, this same code can be used to authorize requests to Azure Storage from an application running in Azure. However, you'll need to enable managed identity on your app in Azure. Then configure your storage account to allow that managed identity to connect. For detailed instructions on configuring this connection between Azure services, see the Auth from Azure-hosted apps tutorial.

In this blog I will show you a very useful Python package that has come in handy for me over the past few months while interacting with Azure Blob Storage: programmatically uploading, downloading, and listing blobs in a container in an Azure storage account. For this I am using the azure.storage.blob package.

Now make sure you set up an environment variable holding the connection string for your Azure storage account. You can also use Azure Key Vault for authentication instead of environment variables. I prefer Azure Key Vault, and if you have not used it before, I will be writing a blog post showing how you can use Key Vault for authentication in Python.
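
A sketch of the basic upload/download/list workflow (the environment variable name, container, and file names are placeholders):

    import os
    from azure.storage.blob import BlobServiceClient

    # Placeholder variable name; use whatever name you set in your environment.
    conn_str = os.environ["AZURE_STORAGE_CONNECTION_STRING"]
    service = BlobServiceClient.from_connection_string(conn_str)
    container = service.get_container_client("mycontainer")

    # Upload a local file as a blob.
    with open("report.csv", "rb") as f:
        container.upload_blob(name="report.csv", data=f, overwrite=True)

    # List the blobs in the container.
    for blob in container.list_blobs():
        print(blob.name)

    # Download a blob back to a local file.
    with open("report_copy.csv", "wb") as f:
        f.write(container.download_blob("report.csv").readall())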

To search for blobs with specific tags in Azure Blob Storage using Python, you can use the find_blobs_by_tags method from the Azure Storage SDK. It takes a single parameter, filter_expression, which expresses the tag query (for example, matching a specific tag value, optionally scoped to a container with @container).

find_blobs_by_tags returns an iterable response of FilteredBlob objects. The return value is automatically paginated, and you can control the number of blobs per page with the results_per_page keyword argument. You can iterate the results blob by blob, or page by page using the iterator returned by the by_page method.
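
A minimal sketch (the connection string, tag name, and tag value are placeholders):

    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string("<your-connection-string>")

    # Tag names go in double quotes, values in single quotes.
    filter_expression = "\"project\" = 'alpha'"

    # Iterate blob by blob...
    for blob in service.find_blobs_by_tags(filter_expression, results_per_page=50):
        print(blob.name, blob.container_name)

    # ...or page by page.
    pages = service.find_blobs_by_tags(filter_expression, results_per_page=50).by_page()
    for page in pages:
        for blob in page:
            print(blob.name)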

Voilà! All your images will from now on be saved as blobs in the Azure storage container of your choice. The problem with this approach is that now you have to keep your account key secure, which is not so trivial to accomplish.

After processing some reports in Notebook Server about our geodatabases, Enterprise portals, and ArcGIS Online environments, I want to then write the CSV results to a folder in an Azure Data Lake. However, those libraries aren't in the standard library list. With Notebook Server focused on data science and ML, I figure being able to read and write from Azure, AWS, and Google's Blob/Data Lake storage platforms would be a common use case.

Can you provide details about the exact libraries that you'd like to work with? There is, unfortunately, a broad set of Python APIs for accessing the various cloud services, which makes their inclusion less simple. We do provide `azure-core` and `azure-storage-blob`, and if there are specific demands for others, we can assess their inclusion.
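
With `azure-storage-blob` alone, writing CSV results to a container can look like this (a sketch assuming pandas for the results table; the connection string, container, and blob names are placeholders):

    import pandas as pd
    from azure.storage.blob import BlobServiceClient

    # Example results table.
    df = pd.DataFrame({"portal": ["prod", "dev"], "layers": [120, 34]})

    service = BlobServiceClient.from_connection_string("<your-connection-string>")
    blob = service.get_blob_client(container="reports", blob="portal_report.csv")

    # Serialize the DataFrame to CSV in memory and upload it directly.
    blob.upload_blob(df.to_csv(index=False), overwrite=True)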

Thanks!
Shaun

To communicate with Azure Blob Storage, the code makes use of the BlobServiceClient class from the azure.storage.blob package. We begin by using the connection string to create a BlobServiceClient object.

This client represents interaction with the Azure storage account itself, and allows you to acquire preconfigured client instances to access the containers and blobs within. It provides operations to retrieve and configure the account properties as well as list, create, and delete containers within the account. To perform operations on a specific container or blob, retrieve a client using the get_container_client or get_blob_client methods.

This client represents interaction with a specific container (which need not exist yet), and allows you to acquire preconfigured client instances to access the blobs within. It provides operations to create, delete, or configure a container and includes operations to list, upload, and delete the blobs within it. To perform operations on a specific blob within the container, retrieve a client using the get_blob_client method.
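
A minimal sketch of drilling down through the client hierarchy (the connection string and names are placeholders):

    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string("<your-connection-string>")

    # Drill down from the account to a container, then to a blob.
    container_client = service.get_container_client("mycontainer")
    blob_client = container_client.get_blob_client("myblob.txt")

    # The container need not exist yet; create it on first use.
    if not container_client.exists():
        container_client.create_container()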

The Hot tier is used for storing objects that are actively and consistently read from and written to. The access costs for this tier are the lowest among the tiers; however, the storage costs are the highest.


One thing to note here is that you are given the option to choose the Access Tier and the Blob Type. The Access Tier determines the access vs storage costs of the blob while the Blob Type determines how the blob is optimized.

One of the easiest ways to accomplish this is through Storage Explorer. In Storage Explorer, navigate to the blob you want to update, right-click, and select 'Change Access Tier' from the menu.
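
Since this thread is about Python, the tier can also be changed programmatically with BlobClient.set_standard_blob_tier; a minimal sketch (the connection string and names are placeholders):

    from azure.storage.blob import BlobClient

    blob = BlobClient.from_connection_string(
        conn_str="<your-connection-string>",
        container_name="mycontainer",
        blob_name="archive-me.txt",
    )

    # Move the blob to the Cool tier ("Hot", "Cool", and "Archive" are valid).
    blob.set_standard_blob_tier("Cool")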

A lease on a blob ensures that no one other than the leaseholder can modify or delete that object. Leases can be created for a fixed or infinite duration. The example below initializes an infinite-duration lease on the blob, preventing other users from modifying it.
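
A minimal sketch (the connection string and names are placeholders):

    from azure.storage.blob import BlobClient

    blob = BlobClient.from_connection_string(
        conn_str="<your-connection-string>",
        container_name="mycontainer",
        blob_name="important.txt",
    )

    # lease_duration=-1 requests an infinite lease; pass 15-60 for a fixed duration.
    lease = blob.acquire_lease(lease_duration=-1)

    # ...later, release the lease so others can modify the blob again.
    lease.release()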
