
Azure Storage Blob - How to List and Download Blob from Azure Storage container in Python (No Azure library)

Gorav Singal

February 04, 2021

TL;DR

Use Python's requests library to call Azure Storage REST APIs directly to list and download blobs without any Azure SDK dependency.


Introduction

In this tutorial we will see how to list and download the blobs in a storage container without using the Azure Python libraries.

Note: No Azure library is used, just REST API calls authenticated with a SAS token.

Pre-requisite

This tutorial is based on Python 3.7.

PyPI Dependency

The only third-party package we require is requests.

Complete Code

import os
import re
from urllib.parse import quote

import requests

def _get_file_list_helper(container, next_marker=None):
  """
  Get one page of the files list, starting at next_marker if given
  """
  account_name = container['account_name']
  container_name = container['container_name']
  curl_url = f'https://{account_name}.blob.core.windows.net/{container_name}?restype=container&comp=list&'
  if next_marker:
    # the marker is an opaque string, so URL-encode it before sending it back
    curl_url += f'marker={quote(next_marker, safe="")}&'
  curl_url += container['sas_token']

  print('Executing rest call to azure')
  r = requests.get(curl_url)
  # fail early on an HTTP error (for example an expired SAS token)
  r.raise_for_status()
  text = r.text

  # a non-empty NextMarker indicates there are more files
  next_marker = re.findall(r'<NextMarker>([^<]+)</NextMarker>', text)
  file_names = re.findall(r'<Name>([^<]*)</Name>', text)

  return {'files': file_names, 'next_marker': next_marker}

def get_file_list(container):
  """
  Get the files list
  """
  files = []
  next_marker = None
  while True:
    files_data = _get_file_list_helper(container, next_marker)
    files.extend(files_data['files'])
    if not files_data['next_marker']:
      break
    next_marker = files_data['next_marker'][0]
  return files

def download_files(container, local_dest_path):
  """
  Download every blob of the container under local_dest_path
  """
  files = get_file_list(container)

  account_name = container['account_name']
  container_name = container['container_name']
  url_path = f'https://{account_name}.blob.core.windows.net/{container_name}/'
  url_end_path = '?' + container['sas_token']

  for file_name in files:
    print(f'Downloading: {file_name}')
    url = f'{url_path}{file_name}{url_end_path}'
    path = f'{local_dest_path}/{file_name}'
    # create the local directory if it does not exist yet
    os.makedirs(os.path.dirname(path), exist_ok=True)

    # make the request (for every file, not only when the directory is new)
    r = requests.get(url)
    r.raise_for_status()

    # write the file
    with open(path, "wb") as download_file:
      download_file.write(r.content)

## main starts here
local_dest_path = './container_blob'

container = {
    'account_name': 'account_name',
    'container_name': 'container_name',
    # SAS token without the leading '?'
    'sas_token': 'xxxxxxxxxx'
}
download_files(container, local_dest_path)

Explanation

The code is simple: it calls the Azure Storage REST APIs directly, once per page to list the blobs in the container and once per blob to download it.
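As a minimal sketch of how the two request URLs are put together, the helpers below build the list URL and the per-blob download URL. The function names build_list_url and build_blob_url are hypothetical and not part of the code above; the sketch also assumes the SAS token is stored without a leading '?', and it URL-encodes the marker and the blob name, which the original listing does not do.

```python
from urllib.parse import quote


def build_list_url(account_name, container_name, sas_token, marker=None):
    """Build the URL that lists the blobs of a container."""
    url = (f'https://{account_name}.blob.core.windows.net/{container_name}'
           '?restype=container&comp=list')
    if marker:
        # The marker is opaque and may contain reserved characters.
        url += f'&marker={quote(marker, safe="")}'
    # Assumption: sas_token has no leading '?'.
    return f'{url}&{sas_token}'


def build_blob_url(account_name, container_name, blob_name, sas_token):
    """Build the URL that downloads a single blob."""
    # '/' is kept unescaped so names such as 'abc/test.log' stay readable.
    encoded_name = quote(blob_name, safe='/')
    return (f'https://{account_name}.blob.core.windows.net/{container_name}/'
            f'{encoded_name}?{sas_token}')
```

Encoding the blob name matters once names contain spaces or characters such as '#', which would otherwise be misread as part of the URL syntax.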

Understanding next_marker

When your storage container holds more files than fit in a single response, the API does not return them all in one call. Instead, it returns a fixed number of items along with a next_marker, which indicates that there are more files. This marker has to be sent in the next request to fetch the following page, and the loop stops once a response carries no marker.
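The marker loop can be sketched independently of Azure. In the example below, fetch_page is a hypothetical callable standing in for the REST call: it takes a marker (None for the first page) and returns the items of that page plus the next marker. The fake_pages dictionary is made-up data used only to simulate a service with two pages.

```python
def iter_all_pages(fetch_page):
    """Yield every item across all pages.

    fetch_page(marker) must return (items, next_marker); next_marker is
    empty or None on the last page.
    """
    marker = None
    while True:
        items, marker = fetch_page(marker)
        yield from items
        if not marker:
            break


# Simulated two-page service, standing in for the real REST call.
fake_pages = {
    None: (['a.log', 'b.log'], 'marker-1'),
    'marker-1': (['c.log'], ''),
}
all_files = list(iter_all_pages(lambda marker: fake_pages[marker]))
```

Keeping the loop separate from the HTTP call makes the paging logic easy to test without network access.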

Usage with Azure Official Python Libraries

For usage with Azure official Python libraries, see: List and Download Azure blobs by Azure Python Libraries

Response of the List Blobs REST API

<?xml version="1.0" encoding="utf-8"?><EnumerationResults ServiceEndpoint="https://hubbledmeprodlocb.blob.core.windows.net/" ContainerName="container_name">
  <Blobs>
    <Blob>
      <Name>abc/test.log</Name>
      <Properties>
        <Last-Modified>Mon, 02 Dec 2019 09:42:50 GMT</Last-Modified>
        <Etag>0x8D7770BFF1CC8A1</Etag>
        <Content-Length>423</Content-Length>
        <Content-Type>application/octet-stream</Content-Type>
        <Content-Encoding /><Content-Language />
        <Content-MD5>3ycLC3CutKkybJtlgvEdsQ==</Content-MD5>
        <Cache-Control />
        <Content-Disposition />
        <BlobType>BlockBlob</BlobType>
        <LeaseStatus>unlocked</LeaseStatus>
        <LeaseState>available</LeaseState>
        <ServerEncrypted>true</ServerEncrypted>
      </Properties>
    </Blob>
  ...
  </Blobs>
  <NextMarker>marker_id</NextMarker>

</EnumerationResults>
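The code above extracts names with regular expressions. As an alternative sketch, the same response can be parsed with Python's built-in xml.etree.ElementTree, which also unescapes XML entities such as &amp;amp; in blob names. The function name parse_list_response and the shortened sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def parse_list_response(xml_text):
    """Return (blob_names, next_marker) from a list response.

    next_marker is None when the response has no marker or an empty one.
    """
    root = ET.fromstring(xml_text)
    names = [el.text for el in root.findall('./Blobs/Blob/Name')]
    # findtext returns '' for an empty element and None for a missing one.
    marker = root.findtext('NextMarker') or None
    return names, marker


# Shortened sample modelled on the response shown above.
sample = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<EnumerationResults ContainerName="container_name">'
    '<Blobs>'
    '<Blob><Name>abc/test.log</Name></Blob>'
    '<Blob><Name>abc/a&amp;b.log</Name></Blob>'
    '</Blobs>'
    '<NextMarker>marker_id</NextMarker>'
    '</EnumerationResults>'
)
names, marker = parse_list_response(sample)
```

With a real response, passing r.content (bytes) instead of r.text lets the parser handle the declared encoding itself.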

Hope it helps.
