Tutorial Thursday: Automating satellite image cutting

This is a rather niche-y tutorial: First, I will briefly walk you through my typical workflow for processing satellite imagery. Then we’ll work out how to remove the middle-man QGIS and go directly via the command line to avoid the repetitive clicking and navigating in QGIS, awesome though that program is.

Usually, I start from an event or a development at a certain location which is visible from space. I browse the free satellite imagery websites I know of, most often the SentinelHub EO Browser , to see if whatever interests me is visible. If it is, I briefly celebrate and then download it in GeoTiff format. This is a raster image format which has geolocated pixels. Because of the many information in a GeoTiff, they tend to be rather large files – depending on the area of interest and the resolution, of course.

With SentinelHub, you have easy access to Sentinel and Landsat satellites, which are free to the public. Both are great resources, though Sentinel-2 has the best resolution available for free: ~9 meters per pixel. This is not high enough to make out cars or houses; the development of agricultural land, for example, which often is many hectares large, is very well visible, though.

In the meantime, I open QGIS, or QuantumGIS, a free, open-source program to work with all things geo (GIS stands for geo information system). I open the downloaded file and, when it’s more than one, I merge it via the corresponding GDAL (a library to work with geospatial data) command in the Processing Toolbox. Next, I crop the resulting file to my exact area of interest (AOI). Then I translate – speak “convert” – it to a PNG or JPEG file to reduce its size and have it in a format that can be published on our website.

Afterwards, I, or a colleague of mine, touch the image for some fine-tuning of colors, contrast etc. and maybe put date stamps and annotations on it. But this doesn’t concern us for today. Today, I want to show you another way, a superior way in some aspects, to the clicking and menu-navigating in QGIS.

You can accomplish all of this image processing in the command line as well and it’s really no magic. QGIS even tells you what it does on the command line, so I copied the functions for each step, edited them for my convenience, and put them in a shell script. The big advantage: You can run this script easily a couple of times with little changes and don’t have to click your way through the user interface of QGIS every time.

This is how my area of interest looks – via the very handy geojson.io

.

First, some prerequisites: You need a bash-based shell. If you’re working on a Linux or macOS machine, no problem, it’s the default; if you’re on a Windows machine it depends: With Windows 10 there is supposedly a way to get a bash shell, with older versions not so much (I duckduckgo’ed for a second and found this tutorial for Win10, might be useful). If you don’t have QGIS installed, chances are you don’t have GDAL installed, either. This script is based on GDAL, so visit GDAL.org for installation instructions or look it up in your favorite search engine. Also, sites like StackOverflow can help you.

You will also need GeoTiffs. As I mentioned before, I recommend using SentinelHub. Put the satellite GeoTiffs in a folder called “input” inside your working directory and make sure you have a date in the form of YYYY-MM-DD at the beginning of the filename. The script will look for this pattern to identify groups of .tifs that belong to the same date to merge them.

Next, you will need your area of interest in form of a GeoJSON. An easy tool to get this Geojson.io: draw, download, done. Name them how you want, for example “mobile.geojson” and “desktop.geojson”. To resize the resulting JPGs, you need to have imagemagick installed. Just like the satellite images, these GeoJSONs get their own folder titled “aoi”. If your working directory looks like this and has files inside the aoi and input directory, you’re good to go.

Now, let's go through the script, step by step.

The Header


	#!/bin/bash 
# why the previous line? I wondered, too; here lies the answer: https://stackoverflow.com/questions/8967902/why-do-you-need-to-put-bin-bash-at-the-beginning-of-a-script-file#8967916

# Before you proceed: Make sure you have a 
# directory called 'input' containing your tifs.
# You also need a directory named 'aoi' for your areas-of-interest.
# Also important and not optional:
#  The .tif files have to have the or a date at the front in the
#  form of YYYY-MM-DD or the script won't find them. 

# config your script variables

# remove tifs after creating the jpgs?
clean_up=true

# do you want to directly resize the images to a certain size?
resize=true     

target_srs=EPSG:3857    

# the names of your aoi GeoJSONs
aoi_1=mobile
aoi_2=desktop
# if resize=true, set the size of one side per aoi here:
aoi_1_height=640
# aoi_1_width=\480
# aoi_2_height=\900
aoi_2_width=1440
# you can try and go down to 80 or 75 
# if you're looking to reduce file size:
jpg_quality=92

Here, I define some variables that I will use later, so I can easily alter some aspects of the script by just changing this this header. With clean_up and resize I control whether the created intermediate tifs are deleted and whether I get resized images, respectively. The resizing is a bit tricky. For me, it worked to just define one side of the image, but please refer to the documentation for detailed information.

The Processing


echo "This may take a few minutes, depending on the amount of tifs. For 8 tifs with each at ~60MB it takes ~1 minute."

# Step Zero: Check and create directories
echo "Checking and building output directory:"

if test ! -d output/jpg/ ;
    then mkdir output ;
        echo "Directory 'output' does not exist; creating..." ;
fi

if test ! -d output/jpg/ ;
    then mkdir output/jpg/ ;
        echo "Directory 'output/jpg/' does not exist; creating..." ;
fi

if test ! -d output/tif/ ;
    then mkdir output/tif/ ;
    echo "Directory 'output/tif/' does not exist; creating..." ;
fi

The “echo” lines just print text that follows on the command line, so I have an idea what is happening when I run the script. The following if-clauses make sure that the directories needed for a tidy output exist – or are otherwise created.

Now to the juicy part: These lines have been adopted from the QGIS dialogues and modified for this script and my usage. Here, the filename comes into play: I loop through unique dates and merge them as a group. When that’s done, I use gdalwarp to crop the merged GeoTIFF to the area-of-interest I drew and downloaded from Geojson.io. I then convert it to JPG and, if I want to, I can automatically resize the JPG with the variables I set at the top of the script (For the project that started all this I did want to, so it’s the default).


for date in $(ls input | grep -oP "\d{4}-\d{2}-\d{2}" | uniq)
do 
    echo "Merging tifs for $date..."

    # Step One: Merge all tiffs in the folder
    gdal_merge.py -n 0 -a_nodata 0 -ot Byte -of GTiff -o output/tif/$date-merged.tif input/$date*.tif

    # Step Two: Cropping to cutline; converting/translating to jpg; resizing jpg

    # 1st AOI
    gdalwarp -of GTiff -cutline aoi/$aoi_1.geojson -crop_to_cutline -dstnodata 0.0 -t_srs $target_srs -overwrite output/tif/$date-merged.tif output/tif/$date-$aoi_1.tif

    gdal_translate -a_srs $target_srs -a_nodata 0.0 -ot Byte -of JPEG output/tif/$date-$aoi_1.tif output/jpg/$date-$aoi_1.jpg

    if $resize ; then 
    convert -quality $jpg_quality -resize $aoi_1_width\x$aoi_1_height output/jpg/$date-$aoi_1.jpg output/jpg/$date-$aoi_1.jpg ;
    fi

    # 2nd AOI
    gdalwarp -of GTiff -cutline aoi/$aoi_2.geojson -crop_to_cutline -dstnodata 0.0 -t_srs $target_srs -overwrite output/tif/$date-merged.tif output/tif/$date-$aoi_2.tif

    gdal_translate -a_srs $target_srs -a_nodata 0.0 -ot Byte -of JPEG output/tif/$date-$aoi_2.tif output/jpg/$date-$aoi_2.jpg
    
    if $resize ; then 
    convert -quality $jpg_quality -resize $aoi_2_width\x$aoi_2_height output/jpg/$date-$aoi_2.jpg output/jpg/$date-$aoi_2.jpg ;
    fi

    # removes the big merged tif to free space
    if $clean_up ;
        then rm output/tif/$date-merged.tif ;
    fi
done

If you have more than two areas-of-interest, you can just copy-paste the code block from “# 2nd AOI” to “fi”, the end of the if-clause, configure a third variable with the name of the aoi and replace the names in the copied code block.

The Clean-up

After the files have been created, you might want to tidy up the directory. This part of the code does exactly that for you if the clean_up variable is “true”. It checks and removes a bunch of files and should leave you with the finished JPGs.


# Step Three: Cleaning up the working directory

if $clean_up ;
    then echo "Variable 'clean_up' is set to 'true'; removing tifs..." ; 
        rm output/tif/*-$aoi_1.tif ;
        rm output/tif/*-$aoi_2.tif ;
fi

# removes unnecessary files
echo "Removing unnecessary XML files..."
rm output/jpg/*.xml

# if you didn't comment out the output-tif-removal above,
# the following will remove the empty folder
if test ! -d output/tif/*.tif ;
    then rm -r output/tif/ ;
        echo "The tif directory is empty; removing it for tidyness" ;
    else 
        echo "The tif directory is not empty; keeping it" ;
fi

echo "Yay, finished!"

You can download this script as a Github Gist. If you want to run it, go to your working directory, open a console/terminal and type sh gdal_merge_crop_convert.sh.

This concludes the programming part. However, in many cases you might want to touch the image in a photo editing program like Adobe Photoshop and set some colors right, because the sun and algorithms might tint it green-ish or yellow-ish.

Shortcomings:

The script doesn’t have an option for different image output formats right now and some more stuff could probably be refined. You’re very welcome to contribute to it through feedback or coding! You can contact me via Twitter for example.



About

Moritz Zajonz

Moritz Zajonz works at the German newspaper Süddeutsche Zeitung in the Entwicklungsredaktion. In this data-driven journalism and storytelling department he works on long projects as well as more news-oriented ddj.

Runs on:

How many pie charts have you built?
*An ancient memory pops into his mind. His conscience screams in agony.* “I... - None.”

How many items are on your desktop?
You know, I resisted the urge to just drop stuff on “Desktop” for so long. But I am just a human, weak. So: You can still see some of the wallpaper.

Swear words per day?
Correlates heavily with project intensity and deadlines.

How many adapters do you have?
One to rule them all! Well, and one so I can stick a LAN cable in my other laptop. And there’s also still the problem with VGA monitors...

Your funniest file name?
*.pdf

snow flake
© 2018 Journocode