tutorials|February 03, 2021|3 min read

Azure Storage Blob - How to List Blobs and Download Blobs from an Azure Storage Container in Python (PyPI libs)

TL;DR

Use the official Azure Python SDK (azure-storage-blob) to authenticate, list blobs, and download files from Azure Storage containers with account key, SAS token, or proxy support.

Introduction

In this tutorial we will see:

  • How to instantiate the classes required to talk to an Azure Storage container
  • How to authenticate
    • if we have an account key
    • if we have a SAS token (sas_token)
    • with no auth (just the container name)
  • How to use a proxy
  • How to list the blobs in a storage container
    • Attributes of each blob object
  • How to download a blob

Everything will be in Python.

Prerequisites

This tutorial is based on Python 3.7.

PyPI Dependency

We need the azure-storage-blob package. The code was tested with version 12.7.1.

How to Authenticate and Instantiate

from azure.storage.blob import BlobServiceClient

# consider a dictionary container
container = {
  'account_name': 'your_account_name',
  'container_name': 'your_container_name',
  'sas_token': 'xxxxxxx'
}

# The account URL is derived from the storage account name
account_url = f'https://{container["account_name"]}.blob.core.windows.net'

if "account_key" in container:
    blob_service = BlobServiceClient(
        account_url=account_url, credential=container["account_key"])
elif "sas_token" in container:
    blob_service = BlobServiceClient(
        account_url=account_url, credential=container["sas_token"])
else:
    # No credential: anonymous access
    blob_service = BlobServiceClient(account_url=account_url)

# Now get an instance of the class which has the list_blobs method
container_client = blob_service.get_container_client(container['container_name'])

In the code above, we only instantiate the client classes required for the operation and hand them the credential they will authenticate with. In my example, I have a sas_token.
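The credential choice above can be pulled into a small helper, which makes it easy to check without touching the network. This is only a minimal sketch: the function names account_url_for and pick_credential are my own and not part of the SDK, and it assumes the same container dictionary keys used in this post.

```python
def account_url_for(container):
    """Build the blob endpoint URL from the storage account name."""
    return f'https://{container["account_name"]}.blob.core.windows.net'


def pick_credential(container):
    """Return the account key if present, else the SAS token, else None.

    None means no credential, i.e. anonymous access.
    """
    if "account_key" in container:
        return container["account_key"]
    if "sas_token" in container:
        return container["sas_token"]
    return None


container = {
    'account_name': 'your_account_name',
    'container_name': 'your_container_name',
    'sas_token': 'xxxxxxx',
}

print(account_url_for(container))
print(pick_credential(container))
```

The two return values can then be passed as account_url and credential to BlobServiceClient, which collapses the three branches into a single constructor call.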

Complete Example for list and download blobs (with proxy configuration as well)

import os
from azure.storage.blob import BlobServiceClient

def _create_dirs(dest_path):
    # Create dest_path; if a regular file is in the way, replace it with a directory
    if not os.path.exists(dest_path):
        os.makedirs(dest_path)
    elif not os.path.isdir(dest_path):
        os.remove(dest_path)
        os.makedirs(dest_path)

def _get_container_service(container):
    account_url = f'https://{container["account_name"]}.blob.core.windows.net'
    
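    # Optional proxy settings: a dict mapping URL scheme to proxy URL
    proxies = None
    if 'proxy' in container:
        # The account URL is https, so register the proxy for both schemes
        proxies = {'http': container['proxy'], 'https': container['proxy']}
    # If 'proxy' isn't specified in the container block, check if 'https_proxy' is set.
    elif 'https_proxy' in container:
        proxies = {'https': container['https_proxy']}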

    # instantiate based upon credential
    if "account_key" in container:
        blob_service = BlobServiceClient(
            account_url=account_url, credential=container["account_key"], proxies=proxies)
    elif "sas_token" in container:
        blob_service = BlobServiceClient(
            account_url=account_url, credential=container["sas_token"], proxies=proxies)
    else:
        blob_service = BlobServiceClient(account_url=account_url, proxies=proxies)

    return blob_service.get_container_client(container['container_name'])

def download_blobs(container, dest_path):
    # You might want to handle some exceptions here
    _create_dirs(dest_path)

    # Get the container client (it has the list_blobs method)
    container_client = _get_container_service(container)

    # Note: list_blobs returns a lazy iterator
    blob_list = container_client.list_blobs()

    for blob in blob_list:
        fname = os.path.join(dest_path, blob.name)
        print(f'Downloading {blob.name} to {fname}')

        # get the blob client, which has the download_blob method
        blob_client = container_client.get_blob_client(blob.name)

        # create the parent directories if they do not exist
        _create_dirs(os.path.dirname(fname))

        with open(fname, "wb") as download_file:
            download_file.write(blob_client.download_blob().readall())


## main starts here
local_dest_path = './container_blob'

container = {
    'account_name': 'your_account_name',
    'container_name': 'your_container_name',
    'sas_token': 'xxxxxxx'
}

download_blobs(container, local_dest_path)

The script above is straightforward. My container has nested directories and files. The code iterates over all the blobs and downloads them one by one.
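If only one of the nested directories is needed, the loop can be narrowed with the name_starts_with argument of list_blobs. The following is a sketch under a few assumptions: download_prefix is a hypothetical helper name, container_client is expected to behave like the SDK's ContainerClient, and readinto is used in place of readall so a large blob is written to the file without first being held in memory in full.

```python
import os


def download_prefix(container_client, prefix, dest_path):
    """Download every blob whose name starts with `prefix`.

    Returns the list of local file paths that were written.
    """
    written = []
    for blob in container_client.list_blobs(name_starts_with=prefix):
        # Blob names use '/' as the separator regardless of the local OS
        fname = os.path.join(dest_path, *blob.name.split('/'))
        os.makedirs(os.path.dirname(fname) or '.', exist_ok=True)

        blob_client = container_client.get_blob_client(blob.name)
        with open(fname, 'wb') as download_file:
            # Write the download into the open file object
            blob_client.download_blob().readinto(download_file)
        written.append(fname)
    return written
```

With the container from this post, a call such as download_prefix(container_client, 'fdg/', './container_blob') would fetch only the blobs under fdg/.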

Attributes of a Blob object

{
  'name': 'fdg/cert_discovery.fdg', 
  'snapshot': None, 
  'content': None, 
  'properties': {
    'blob_type': 'BlockBlob', 
    'last_modified': datetime.datetime(2019, 12, 2, 9, 42, 50, tzinfo=tzutc()), 
    'etag': '0x8D7770BFF1CC8A1', 
    'content_length': 423, 
    'content_range': None, 
    'append_blob_committed_block_count': None, 
    'page_blob_sequence_number': None, 
    'server_encrypted': True, 
    'copy': {
      'id': None, 
      'source': None, 
      'status': None, 
      'progress': None, 
      'completion_time': None, 
      'status_description': None
    }, 
    'content_settings': {
      'content_type': 'application/octet-stream', 
      'content_encoding': None, 
      'content_language': None, 
      'content_disposition': None, 
      'cache_control': None, 
      'content_md5': '3ycLC3CutKkybJtlgvEdsQ=='
    }, 
    'lease': {
      'status': 'unlocked', 
      'state': 'available', 
      'duration': None
    }, 
    'blob_tier': None, 
    'blob_tier_change_time': None, 
    'blob_tier_inferred': False, 
    'deleted_time': None, 
    'remaining_retention_days': None, 
    'creation_time': datetime.datetime(2019, 11, 28, 11, 52, 5, tzinfo=tzutc())
  },
  'metadata': None, 
  'deleted': False
}
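These attributes can be used to summarize a container before downloading anything. The sketch below is hedged: summarize_blobs is a hypothetical helper, and it assumes the flat name, size and last_modified attributes that version 12 exposes on the objects yielded by list_blobs. The dump above nests the size as content_length under properties, so adjust the attribute access if your objects look like that. The second sample entry (fdg/older.fdg) is made up for illustration.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def summarize_blobs(blobs):
    """Return (count, total_bytes, newest_name) for an iterable of blobs."""
    count = 0
    total_bytes = 0
    newest = None
    for blob in blobs:
        count += 1
        total_bytes += blob.size
        if newest is None or blob.last_modified > newest.last_modified:
            newest = blob
    return count, total_bytes, (newest.name if newest else None)


# Stand-in objects so the sketch runs without a storage account;
# with the SDK you would pass container_client.list_blobs() instead.
sample = [
    SimpleNamespace(
        name='fdg/cert_discovery.fdg', size=423,
        last_modified=datetime(2019, 12, 2, 9, 42, 50, tzinfo=timezone.utc)),
    SimpleNamespace(
        name='fdg/older.fdg', size=100,
        last_modified=datetime(2019, 11, 28, 11, 52, 5, tzinfo=timezone.utc)),
]

print(summarize_blobs(sample))
```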

Usage with only Python libraries, without the Azure libraries

For usage without Azure libraries, see: List and Download Azure blobs by Python Libraries

Let me know if you face any difficulties, and I will try to resolve them.
