mapillary_downloader
Mapillary data downloader.
mapillary_downloader.webp_converter
WebP image conversion utilities.
check_cwebp_available
def check_cwebp_available()
Check if cwebp binary is available.
Returns:
bool
- True if cwebp is found, False otherwise
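A check like this is commonly done by searching `PATH` for the executable. The following is a minimal sketch under that assumption; the name `cwebp_on_path` is hypothetical and this is not necessarily how `check_cwebp_available` is implemented.

```python
import shutil


def cwebp_on_path(binary="cwebp"):
    """Return True if an executable named `binary` is found on PATH."""
    # shutil.which returns the full path to the executable, or None.
    return shutil.which(binary) is not None
```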
convert_to_webp
def convert_to_webp(jpg_path, output_path=None, delete_original=True)
Convert a JPG image to WebP format, preserving EXIF metadata.
Arguments:
jpg_path
- Path to the JPG file
output_path
- Optional path for the WebP output. If None, uses jpg_path with a .webp extension
delete_original
- Whether to delete the original JPG after conversion (default: True)
Returns:
Path object to the new WebP file, or None if conversion failed
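To illustrate the default output path and the metadata-preserving conversion, here is a sketch that only builds a `cwebp` command line. It assumes the conversion shells out to `cwebp` with `-metadata all`; the helper name `build_cwebp_command` is hypothetical and the actual flags used by `convert_to_webp` may differ.

```python
from pathlib import Path


def build_cwebp_command(jpg_path, output_path=None):
    """Build a cwebp argument list that keeps EXIF/ICC/XMP metadata."""
    jpg_path = Path(jpg_path)
    if output_path is None:
        # Same location and stem as the JPG, with a .webp extension.
        output_path = jpg_path.with_suffix(".webp")
    # "-metadata all" asks cwebp to copy all metadata it finds in the input.
    return ["cwebp", "-metadata", "all", str(jpg_path), "-o", str(output_path)]
```

The resulting list can be passed to `subprocess.run(...)`; deleting the original should only happen after the command succeeds.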
mapillary_downloader.exif_writer
EXIF metadata writer for Mapillary images.
decimal_to_dms
def decimal_to_dms(decimal)
Convert decimal degrees to degrees, minutes, seconds format for EXIF.
Arguments:
decimal
- Decimal degrees (can be negative)
Returns:
Tuple of ((degrees, 1), (minutes, 1), (seconds, 100)) as rational numbers
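The return shape can be illustrated with a small sketch. It assumes the sign is dropped (EXIF stores the hemisphere in a separate reference tag) and that seconds are stored in hundredths; `decimal_to_dms_sketch` is a hypothetical name, not the module's code.

```python
def decimal_to_dms_sketch(decimal):
    """Convert decimal degrees to EXIF-style rational (deg, min, sec) tuples."""
    value = abs(decimal)  # assumption: hemisphere is recorded elsewhere
    degrees = int(value)
    minutes_float = (value - degrees) * 60
    minutes = int(minutes_float)
    # Seconds are scaled by 100 so the denominator 100 keeps two decimals.
    # Note: rounding can yield 6000 (60.00 s); this sketch does not carry over.
    seconds = round((minutes_float - minutes) * 60 * 100)
    return ((degrees, 1), (minutes, 1), (seconds, 100))
```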
timestamp_to_exif_datetime
def timestamp_to_exif_datetime(timestamp)
Convert Unix timestamp to EXIF datetime string.
Arguments:
timestamp
- Unix timestamp in milliseconds
Returns:
String in format “YYYY:MM:DD HH:MM:SS”
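A minimal sketch of this conversion follows. It assumes the timestamp is interpreted as UTC, which the documentation above does not state; `timestamp_ms_to_exif` is a hypothetical name.

```python
from datetime import datetime, timezone


def timestamp_ms_to_exif(timestamp_ms):
    """Format a millisecond Unix timestamp as an EXIF datetime string."""
    # Assumption: UTC. The real function may use local time instead.
    dt = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    return dt.strftime("%Y:%m:%d %H:%M:%S")
```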
write_exif_to_image
def write_exif_to_image(image_path, metadata)
Write EXIF metadata from Mapillary API to downloaded image.
Arguments:
image_path
- Path to the downloaded image file
metadata
- Dictionary of metadata from Mapillary API
Returns:
True if successful, False otherwise
mapillary_downloader.utils
Utility functions for formatting and display.
format_size
def format_size(bytes_count)
Format bytes as human-readable size.
Arguments:
bytes_count
- Number of bytes
Returns:
Formatted string (e.g. “1.23 GB”, “456.78 MB”)
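One way to produce strings of this shape is sketched below. It assumes 1024-based units and two decimal places; `format_size_sketch` is a hypothetical name and the real function may choose units differently.

```python
def format_size_sketch(bytes_count):
    """Format a byte count with two decimals and a 1024-based unit."""
    units = ["B", "KB", "MB", "GB", "TB"]
    size = float(bytes_count)
    for unit in units:
        # Stop at the first unit where the value is below 1024,
        # or at the largest unit available.
        if size < 1024 or unit == units[-1]:
            return f"{size:.2f} {unit}"
        size /= 1024
```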
format_time
def format_time(seconds)
Format seconds as human-readable time.
Arguments:
seconds
- Number of seconds
Returns:
Formatted string (e.g. “2h 15m”, “45m 30s”, “30s”)
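The three example shapes suggest that only the two most significant units are shown. The sketch below follows that reading; `format_time_sketch` is a hypothetical name and the rounding behaviour is an assumption.

```python
def format_time_sketch(seconds):
    """Format a duration as 'Xh Ym', 'Xm Ys', or 'Xs'."""
    seconds = int(seconds)  # assumption: fractional seconds are truncated
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}h {minutes}m"
    if minutes:
        return f"{minutes}m {secs}s"
    return f"{secs}s"
```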
mapillary_downloader.tar_sequences
Tar sequence directories for efficient Internet Archive uploads.
tar_sequence_directories
def tar_sequence_directories(collection_dir)
Tar all sequence directories in a collection for faster IA uploads.
Arguments:
collection_dir
- Path to collection directory (e.g., mapillary-user-quality/)
Returns:
Tuple of (tarred_count, total_files_tarred)
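The general technique can be sketched with the standard `tarfile` module. This version assumes every subdirectory of the collection is a sequence, writes an uncompressed `<name>.tar` next to it, and leaves the original directory in place; the real function may name, filter, or clean up differently. `tar_subdirectories` is a hypothetical name.

```python
import tarfile
from pathlib import Path


def tar_subdirectories(collection_dir):
    """Tar each non-empty subdirectory; return (tarred_count, total_files)."""
    collection_dir = Path(collection_dir)
    tarred_count = 0
    total_files = 0
    # Snapshot the directory list before creating any tar files.
    for seq_dir in sorted(p for p in collection_dir.iterdir() if p.is_dir()):
        files = [f for f in seq_dir.rglob("*") if f.is_file()]
        if not files:
            continue  # nothing to archive
        tar_path = collection_dir / f"{seq_dir.name}.tar"
        with tarfile.open(tar_path, "w") as tar:
            tar.add(seq_dir, arcname=seq_dir.name)
        tarred_count += 1
        total_files += len(files)
    return tarred_count, total_files
```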
mapillary_downloader.ia_check
Check if collections exist on Internet Archive.
check_ia_exists
def check_ia_exists(collection_name)
Check if a collection exists on Internet Archive.
Arguments:
collection_name
- Name of the collection (e.g., mapillary-username-original-webp)
Returns:
Boolean indicating if the collection exists on IA
mapillary_downloader.__main__
CLI entry point.
main
def main()
Main CLI entry point.
mapillary_downloader.downloader
Main downloader logic.
get_cache_dir
def get_cache_dir()
Get XDG cache directory for staging downloads.
Returns:
Path to cache directory for mapillary_downloader
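Under the XDG Base Directory convention, the cache root is `$XDG_CACHE_HOME` when set and non-empty, otherwise `~/.cache`. A minimal sketch of that lookup follows; `xdg_cache_dir` is a hypothetical name, and whether the real function also creates the directory is not stated here.

```python
import os
from pathlib import Path


def xdg_cache_dir(app_name="mapillary_downloader"):
    """Resolve the per-application cache directory (without creating it)."""
    base = os.environ.get("XDG_CACHE_HOME")
    # An unset or empty variable falls back to ~/.cache.
    root = Path(base) if base else Path.home() / ".cache"
    return root / app_name
```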
MapillaryDownloader Objects
class MapillaryDownloader()
Handles downloading Mapillary data for a user.
__init__
def __init__(client,
output_dir,
username=None,
quality=None,
workers=None,
tar_sequences=True,
convert_webp=False,
check_ia=True)
Initialize the downloader.
Arguments:
client
- MapillaryClient instance
output_dir
- Base directory to save downloads (final destination)
username
- Mapillary username (for collection directory)
quality
- Image quality (for collection directory)
workers
- Number of parallel workers (default: half of cpu_count)
tar_sequences
- Whether to tar sequence directories after download (default: True)
convert_webp
- Whether to convert images to WebP (affects collection name)
check_ia
- Whether to check if collection exists on Internet Archive (default: True)
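The `workers` default of "half of cpu_count" could be computed as in the sketch below. The floor of one worker and the handling of an unknown CPU count are assumptions; `default_worker_count` is a hypothetical name.

```python
import os


def default_worker_count():
    """Half of the available CPUs, but never fewer than one worker."""
    # os.cpu_count() can return None when the count is undetermined.
    return max(1, (os.cpu_count() or 1) // 2)
```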
download_user_data
def download_user_data(bbox=None, convert_webp=False)
Download all images for a user.
Arguments:
bbox
- Optional bounding box [west, south, east, north]
convert_webp
- Convert images to WebP format after download
mapillary_downloader.worker
Worker process for parallel image download and conversion.
download_and_convert_image
def download_and_convert_image(image_data, output_dir, quality, convert_webp,
access_token)
Download and optionally convert a single image.
This function is designed to run in a worker process.
Arguments:
image_data
- Image metadata dict from API
output_dir
- Base output directory path
quality
- Quality level (256, 1024, 2048, original)
convert_webp
- Whether to convert to WebP
access_token
- Mapillary API access token
Returns:
Tuple of (image_id, bytes_downloaded, success, error_msg)
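Because each worker call returns a flat tuple, the parent process can aggregate results without sharing state. The sketch below shows one way to consume such tuples; `summarize_results` is a hypothetical helper, and counting bytes only for successful downloads is an assumption.

```python
def summarize_results(results):
    """Aggregate (image_id, bytes_downloaded, success, error_msg) tuples."""
    total_bytes = 0
    failures = {}
    for image_id, bytes_downloaded, success, error_msg in results:
        if success:
            total_bytes += bytes_downloaded
        else:
            # Keep the error message so failed images can be reported or retried.
            failures[image_id] = error_msg
    return total_bytes, failures
```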
mapillary_downloader.logging_config
Logging configuration with colored output for TTY.
ColoredFormatter Objects
class ColoredFormatter(logging.Formatter)
Formatter that adds color to log levels when output is a TTY.
__init__
def __init__(fmt=None, datefmt=None, use_color=True)
Initialize the formatter.
Arguments:
fmt
- Log format string
datefmt
- Date format string
use_color
- Whether to use colored output
format
def format(record)
Format the log record with colors if appropriate.
Arguments:
record
- LogRecord to format
Returns:
Formatted log string
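A formatter of this kind can be sketched by wrapping the formatted output in ANSI escape codes. This simplified version colors the whole line rather than just the level name, uses an assumed color table, and leaves TTY detection to the caller; `ColorSketchFormatter` is a hypothetical class, not the module's `ColoredFormatter`.

```python
import logging


class ColorSketchFormatter(logging.Formatter):
    """Wrap formatted records in an ANSI color chosen by log level."""

    # Assumed palette; the real formatter may use different codes.
    COLORS = {
        "DEBUG": "\033[36m",
        "WARNING": "\033[33m",
        "ERROR": "\033[31m",
        "CRITICAL": "\033[35m",
    }
    RESET = "\033[0m"

    def __init__(self, fmt=None, datefmt=None, use_color=True):
        super().__init__(fmt, datefmt)
        self.use_color = use_color

    def format(self, record):
        message = super().format(record)
        color = self.COLORS.get(record.levelname) if self.use_color else None
        return f"{color}{message}{self.RESET}" if color else message
```

A caller would typically pass `use_color=sys.stderr.isatty()` so that redirected output stays free of escape codes.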
setup_logging
def setup_logging(level=logging.INFO)
Set up logging with timestamps and colored output.
Arguments:
level
- Logging level to use
add_file_handler
def add_file_handler(log_file, level=logging.INFO)
Add a file handler to the logger for archival.
Arguments:
log_file
- Path to log file
level
- Logging level for file handler
mapillary_downloader.ia_meta
Internet Archive metadata generation for Mapillary collections.
parse_collection_name
def parse_collection_name(directory)
Parse username and quality from directory name.
Arguments:
directory
- Path to collection directory (e.g., mapillary-username-original or mapillary-username-original-webp)
Returns:
Tuple of (username, quality) or (None, None) if parsing fails
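Usernames may themselves contain hyphens, so splitting on `-` alone is fragile. The sketch below anchors on the quality levels listed for the worker (256, 1024, 2048, original) and an optional `-webp` suffix; the pattern and the name `parse_name_sketch` are assumptions, not the module's implementation.

```python
import re
from pathlib import Path

# Assumed naming scheme: mapillary-<username>-<quality>[-webp]
_NAME_PATTERN = re.compile(
    r"^mapillary-(?P<username>.+)-(?P<quality>256|1024|2048|original)(?:-webp)?$"
)


def parse_name_sketch(directory):
    """Return (username, quality), or (None, None) if the name does not match."""
    match = _NAME_PATTERN.match(Path(directory).name)
    if not match:
        return None, None
    return match.group("username"), match.group("quality")
```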
get_date_range
def get_date_range(metadata_file)
Get first and last captured_at dates from metadata.jsonl.gz.
Arguments:
metadata_file
- Path to metadata.jsonl.gz file
Returns:
Tuple of (first_date, last_date) as ISO format strings, or (None, None)
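Reading a gzipped JSON Lines file can be done in a single streaming pass. This sketch assumes one JSON object per line with a `captured_at` field in milliseconds, interprets it as UTC, and returns date-only ISO strings; the real function's output precision is not specified above. `date_range_sketch` is a hypothetical name.

```python
import gzip
import json
from datetime import datetime, timezone


def date_range_sketch(metadata_file):
    """Return (first_date, last_date) as ISO dates, or (None, None)."""
    first = last = None
    with gzip.open(metadata_file, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            captured_at = json.loads(line).get("captured_at")
            if captured_at is None:
                continue  # skip records without a capture time
            if first is None or captured_at < first:
                first = captured_at
            if last is None or captured_at > last:
                last = captured_at
    if first is None:
        return None, None

    def to_iso(ms):
        return datetime.fromtimestamp(ms / 1000, tz=timezone.utc).date().isoformat()

    return to_iso(first), to_iso(last)
```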
count_images
def count_images(metadata_file)
Count number of images in metadata.jsonl.gz.
Arguments:
metadata_file
- Path to metadata.jsonl.gz file
Returns:
Number of images
write_meta_tag
def write_meta_tag(meta_dir, tag, values)
Write metadata tag files in rip format.
Arguments:
meta_dir
- Path to .meta directory
tag
- Tag name
values
- Single value or list of values
generate_ia_metadata
def generate_ia_metadata(collection_dir)
Generate Internet Archive metadata for a Mapillary collection.
Arguments:
collection_dir
- Path to collection directory (e.g., ./mapillary_data/mapillary-username-original)
Returns:
True if successful, False otherwise
mapillary_downloader.client
Mapillary API client.
MapillaryClient Objects
class MapillaryClient()
Client for interacting with Mapillary API v4.
__init__
def __init__(access_token)
Initialize the client with an access token.
Arguments:
access_token
- Mapillary API access token
get_user_images
def get_user_images(username, bbox=None, limit=2000)
Get images uploaded by a specific user.
Arguments:
username
- Mapillary username
bbox
- Optional bounding box [west, south, east, north]
limit
- Number of results per page (max 2000)
Yields:
Image data dictionaries
download_image
def download_image(image_url, output_path)
Download an image from a URL.
Arguments:
image_url
- URL of the image to download
output_path
- Path to save the image
Returns:
Number of bytes downloaded if successful, 0 otherwise
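The "bytes on success, 0 on failure" contract can be sketched with the standard library. This version streams the response in chunks using `urllib.request`; the actual client may use a different HTTP library, retries, or authentication, and `download_sketch` is a hypothetical name.

```python
import urllib.request


def download_sketch(image_url, output_path, chunk_size=65536):
    """Stream a URL to disk; return bytes written, or 0 on failure."""
    try:
        total = 0
        with urllib.request.urlopen(image_url, timeout=30) as response, \
                open(output_path, "wb") as out:
            # Read in chunks so large images are never held fully in memory.
            while chunk := response.read(chunk_size):
                out.write(chunk)
                total += len(chunk)
        return total
    except OSError:
        # urllib.error.URLError and socket timeouts are OSError subclasses.
        # Note: a partially written file is not cleaned up in this sketch.
        return 0
```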