Sunday, June 8, 2025

5 Key Lessons for Google Earth Engine Learners | by Daniel Pazmiño Vernaza | Jan, 2025


Hands-On Insights from a Python API user

Land cover map for the Paute water basin in Ecuador for the year 2020. Image created using Google Earth Engine Python API and Geemap. Data source: Friedl, M., Sulla-Menashe, D. (2022); Lehner, B., Grill G. (2013) and Lehner, B., Verdin, K., Jarvis, A. (2008).

As a climate scientist, I count Google Earth Engine (GEE) as a powerful tool in my toolkit. No more downloading heavy satellite images to my computer.

GEE’s main API is JavaScript, although Python users can also access a powerful API to perform similar tasks. Unfortunately, there are fewer materials for learning GEE with Python.

However, I love Python. Ever since I learned that GEE has a Python API, I have imagined a world of possibilities combining GEE’s powerful cloud-processing capabilities with Python frameworks.

The five lessons come from my most recent project, which involved analyzing water balance and drought in a water basin in Ecuador. Nevertheless, the tips, code snippets and examples could apply to any project.

The story presents each lesson following the sequence of any data analysis project: data preparation (and planning), analysis, and visualization.

It is also worth mentioning that I provide some general advice independent of the language you use.

This article for GEE beginners assumes an understanding of Python and some geospatial concepts.

If you know Python but are new to GEE (like me some time ago), you should know that GEE has optimized functions for processing satellite images. We won’t delve into the details of these functions here; you should check the official documentation.

However, my advice is to check first whether GEE can perform the analysis you want to conduct. When I first started using GEE, I used it as a catalogue for finding data, relying only on its basic functions. I would then write Python code for most of the analyses. While this approach can work, it often leads to significant challenges. I’ll discuss these challenges in later lessons.

Don’t limit yourself to learning only the basic GEE functions. If you know Python (or coding in general), the learning curve for these functions is not very steep. Try to use them as much as possible; it is worth it in terms of efficiency.

A final note: GEE functions even support machine learning tasks. These GEE functions are easy to implement and can help you solve many problems. Only when you cannot solve your problem with these functions should you consider writing Python code from scratch.

As an example for this lesson, consider the implementation of a clustering algorithm.

Example code with GEE functions

# Sample the image to create input for clustering
# (clustering_image and galapagos_aoi are assumed to be defined earlier)
sample_points = clustering_image.sample(
    region=galapagos_aoi,
    scale=30,          # Scale in meters
    numPixels=5000,    # Number of points to sample
    geometries=False   # Don't include geometry to save memory
)

# Apply k-means clustering (unsupervised)
clusterer = ee.Clusterer.wekaKMeans(5).train(sample_points)

# Cluster the image
result = clustering_image.cluster(clusterer)

Example code with Python

import numpy as np
from osgeo import gdal, gdal_array

# Tell GDAL to throw Python exceptions and register all drivers
gdal.UseExceptions()
gdal.AllRegister()

# Open the .tiff file
img_ds = gdal.Open('Sentinel-2_L2A_Galapagos.tiff', gdal.GA_ReadOnly)
if img_ds is None:
    raise FileNotFoundError("The specified file could not be opened.")

# Prepare an empty array to store the image data for all bands
img = np.zeros(
    (img_ds.RasterYSize, img_ds.RasterXSize, img_ds.RasterCount),
    dtype=gdal_array.GDALTypeCodeToNumericTypeCode(img_ds.GetRasterBand(1).DataType),
)

# Read each band into the corresponding slice of the array
for b in range(img_ds.RasterCount):
    img[:, :, b] = img_ds.GetRasterBand(b + 1).ReadAsArray()

print("Shape of the image with all bands:", img.shape)  # (height, width, num_bands)

# Reshape for processing
new_shape = (img.shape[0] * img.shape[1], img.shape[2])  # (num_pixels, num_bands)
X = img.reshape(new_shape)

print("Shape of reshaped data for all bands:", X.shape)  # (num_pixels, num_bands)

The first block of code is not only shorter, but it also handles large satellite datasets more efficiently because GEE functions are designed to scale across the cloud. Note that the Python block only loads and reshapes the image; the clustering itself would still have to be written.

While GEE’s functions are powerful, understanding the limitations of cloud processing is crucial when scaling up your project.

Access to free cloud computing resources to process satellite images is a blessing. However, it is not surprising that GEE imposes limits to ensure fair resource distribution. If you plan to use it for a non-commercial large-scale project (e.g. researching deforestation in the Amazon region) and intend to stay within the free-tier limits, you should plan accordingly. My general guidelines are:

  • Limit the sizes of your regions, divide them, and work in batches. I didn’t need to do this in my project because I was working with a single small water basin. However, if your project involves large geographical areas, this would be the first logical step.
  • Optimize your scripts by prioritizing GEE functions (see Lesson 1).
  • Choose datasets that let you optimize computing power. For example, in my last project, I used the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS). The original dataset has a daily temporal resolution. However, it offers an alternative version called “PENTAD”, which provides data every 5 days. Each value corresponds to the sum of precipitation for those 5 days. Using this dataset allowed me to save computer power by processing the compacted version without sacrificing the quality of my results.
  • Examine the description of your dataset, as it might reveal scaling factors that could save computer power. For instance, in my water balance project, I used Moderate Resolution Imaging Spectroradiometer (MODIS) data, specifically the MOD16 dataset, which is a readily available Evapotranspiration (ET) product. According to the documentation, the values should be multiplied by a scaling factor of 0.1. Scaling factors help reduce storage requirements by adjusting the data type.
  • If worst comes to worst, be prepared to compromise. Reduce the resolution of the analyses if the standards of the study allow it. For example, the “reduceRegion” GEE function lets you summarize the values of a region (sum, mean, etc.). It has a parameter called “scale” which allows you to change the scale of the analysis. For instance, if your satellite data has a resolution of 10 m and GEE can’t process your analysis, you can adjust the scale parameter to a coarser resolution (e.g. 50 m).
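To make the first guideline more concrete, here is a minimal sketch of how a large bounding box could be divided into smaller tiles to process in batches. It is plain Python and does not call GEE; the function name split_bbox and the coordinates are hypothetical and only serve as an illustration.

```python
from typing import List, Tuple

# (west, south, east, north) in decimal degrees
BBox = Tuple[float, float, float, float]


def split_bbox(bbox: BBox, n_cols: int, n_rows: int) -> List[BBox]:
    """Split a bounding box into n_cols x n_rows equally sized tiles."""
    if n_cols < 1 or n_rows < 1:
        raise ValueError("n_cols and n_rows must be at least 1")
    west, south, east, north = bbox
    if east <= west or north <= south:
        raise ValueError("bbox must satisfy west < east and south < north")

    dx = (east - west) / n_cols   # tile width in degrees
    dy = (north - south) / n_rows  # tile height in degrees

    tiles = []
    for row in range(n_rows):       # south to north
        for col in range(n_cols):   # west to east
            tiles.append((
                west + col * dx,
                south + row * dy,
                west + (col + 1) * dx,
                south + (row + 1) * dy,
            ))
    return tiles


# Hypothetical study area, used only for illustration
study_area = (-80.0, -4.0, -78.0, -2.0)
for tile in split_bbox(study_area, n_cols=2, n_rows=2):
    print(tile)
```

Each tile could then be turned into a GEE geometry (for example with ee.Geometry.Rectangle) and processed one at a time, so that no single request covers the whole area. Whether this is needed depends on the size of your region and the limits you run into.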

As an example from my water balance and drought project, consider the following block of code:

# Reduce the collection to a single image (mean MSI over the time period)
MSI_mean = MSI_collection.select('MSI').mean().clip(pauteBasin)

# Use reduceRegion to calculate the min and max
stats = MSI_mean.reduceRegion(
    reducer=ee.Reducer.minMax(),  # Reducer to get min and max
    geometry=pauteBasin,          # Specify the ROI
    scale=500,                    # Scale in meters
    maxPixels=1e9                 # Maximum number of pixels to process
)

# Get the results as a dictionary
min_max = stats.getInfo()

# Print the min and max values
print('Min and Max values:', min_max)

In my project, I used a Sentinel-2 satellite image to calculate a moisture soil index (MSI). Then, I applied the “reduceRegion” GEE function, which calculates a summary of values in a region (mean, sum, etc.).

In my case, I needed to find the maximum and minimum MSI values to check if my results made sense. The following plot shows the MSI values spatially distributed in my study region.

Monthly mean of moisture soil index values for the Paute basin (Ecuador) for the period 2010–2020. Image created using Google Earth Engine Python API and Geemap. Data source: European Space Agency (2025); Lehner, B., Grill G. (2013) and Lehner, B., Verdin, K., Jarvis, A. (2008).

The original image has a 10 m resolution. GEE struggled to process the data. Therefore, I used the scale parameter and lowered the resolution to 500 m. After changing this parameter, GEE was able to process the data.
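A rough back-of-the-envelope calculation helps explain why changing the scale makes such a difference. The sketch below estimates how many pixels cover a region at a given pixel size. The 5,000 km² area is a hypothetical value chosen for illustration, not the actual size of my basin, and the function name is my own.

```python
def pixels_at_scale(area_km2: float, scale_m: float) -> float:
    """Approximate number of square pixels needed to cover an area.

    area_km2: region area in square kilometres
    scale_m:  pixel size (edge length) in metres
    """
    if area_km2 <= 0 or scale_m <= 0:
        raise ValueError("area_km2 and scale_m must be positive")
    area_m2 = area_km2 * 1_000_000  # 1 km^2 = 1,000,000 m^2
    return area_m2 / (scale_m ** 2)


AREA_KM2 = 5_000  # hypothetical region size, for illustration only

for scale in (10, 50, 500):
    print(f"{scale:>4} m -> {pixels_at_scale(AREA_KM2, scale):,.0f} pixels")

# Ratio between the pixel counts at 10 m and at 500 m
reduction = pixels_at_scale(AREA_KM2, 10) / pixels_at_scale(AREA_KM2, 500)
print(f"Going from 10 m to 500 m divides the pixel count by {reduction:,.0f}")
```

The ratio does not depend on the area: the pixel count falls with the square of the scale, so moving from 10 m to 500 m means roughly 2,500 times fewer pixels to reduce. The price is spatial detail, so this only makes sense when the standards of the study allow it.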

I’m obsessed with data quality. As a result, I use data but rarely trust it without verification. I like to invest time in ensuring the data is ready for analysis. However, don’t let image corrections paralyze your progress.

My tendency to invest too much time in image corrections stems from learning remote sensing and image corrections “the old way”. By this, I mean using software that assists in applying atmospheric and geometric corrections to images.

Nowadays, scientific agencies supporting satellite missions can deliver images with a high level of preprocessing. In fact, a great feature of GEE is its catalogue, which makes it easy to find ready-to-use analysis products.

Preprocessing is the most time-consuming task in any data science project. Therefore, it must be appropriately planned and managed.

The best approach before starting a project is to establish data quality standards. Based on your standards, allocate enough time to find the best product (which GEE facilitates) and apply only the necessary corrections (e.g. cloud masking).

If you love programming in Python (like me), you might often find yourself coding everything from scratch.

As a PhD student (just starting with coding), I wrote a script to perform a t-test over a study region. Later, I discovered a Python library that performed the same task. When I compared my script’s results with those from the library, the results were correct. However, using the library from the start could have saved me time.

I’m sharing this lesson to help you avoid these silly mistakes with GEE. I’ll mention two examples from my water balance project.

Example 1

To calculate the water balance in my basin, I needed ET data. ET is not an observed variable (like precipitation); it must be calculated.

The ET calculation is not trivial. You can look up the equations in textbooks and implement them in Python. However, some researchers have published papers related to this calculation and shared their results with the community.

This is where GEE comes in. The GEE catalogue not only provides observed data (as I initially thought) but also many derived products or modelled datasets (e.g. reanalysis data, land cover, vegetation indices, etc.). Guess what? I found a ready-to-use global ET dataset in the GEE catalogue. A lifesaver!

Example 2

I also consider myself a Geographic Information System (GIS) expert. Over time, I have acquired a substantial amount of GIS data for my work, such as water basin boundaries in shapefile format.

In my water balance project, my intuition was to import my water basin boundary shapefile into my GEE project. From there, I transformed the file into a Geopandas object and continued my analysis.

In this case, I wasn’t as lucky as in Example 1. I lost precious time trying to work with this Geopandas object, which I couldn’t integrate well with GEE. Ultimately, this approach didn’t make sense. GEE does have in its catalogue a product for water basin boundaries that is easy to handle.

Thus, a key takeaway is to keep your workflow within GEE whenever possible.

As mentioned at the beginning of this article, integrating GEE with Python libraries can be incredibly powerful.

However, even for simple analyses and plots, the integration does not seem straightforward.

This is where Geemap comes in. Geemap is a Python package designed for interactive geospatial analysis and visualization with GEE.

Additionally, I found that it can help with creating static plots in Python. I made plots using GEE and Geemap in my water balance and drought project. The images included in this story were created with these tools.

GEE is a powerful tool. However, as a beginner, pitfalls are inevitable. This article provides tips and tricks to help you start on the right foot with the GEE Python API.

European Space Agency (2025). Harmonized Sentinel-2 MSI: MultiSpectral Instrument, Level-2A.

Friedl, M., Sulla-Menashe, D. (2022). MODIS/Terra+Aqua Land Cover Type Yearly L3 Global 500m SIN Grid V061 [Data set]. NASA EOSDIS Land Processes Distributed Active Archive Center. Accessed 2025-01-15 from https://doi.org/10.5067/MODIS/MCD12Q1.061

Lehner, B., Verdin, K., Jarvis, A. (2008): New global hydrography derived from spaceborne elevation data. Eos, Transactions, AGU, 89(10): 93–94.

Lehner, B., Grill G. (2013): Global river hydrography and network routing: baseline data and new approaches to study the world’s large river systems. Hydrological Processes, 27(15): 2171–2186. Data is available at www.hydrosheds.org
