Understanding and Filling Gaps in Rasters
Rasters can sometimes have holes, also known as voids, gaps, or NoData areas. These gaps can be big and obvious, or small and hard to notice. So, how do you fill in these gaps with meaningful values while keeping the existing data intact?
There are several ways to do this. Let’s go over some background information, key points to consider, and four common solutions.
What is NoData?
A raster organizes data, such as categories, measurements, elevations, or image reflectance, into a grid of equally sized cells. Sometimes there isn’t enough information to assign a value to a cell, either in the data source or after analysis. That’s where the concept of NoData comes in.
When showing rasters, you can set NoData cells to display in a chosen color or not show at all (making them transparent).
For analysis, how NoData cells are handled depends on the tool you’re using. Some tools will keep these cells as NoData in the output, while others might fill them based on available values. The documentation for each tool usually explains its NoData behavior, but it might not fit your analysis needs. You might need to identify these NoData cells so you can replace them with suitable values. How can you do that?
Identifying NoData Cells
The Is Null tool identifies NoData cells. It checks each cell and returns a value of 1 if the cell is NoData, and 0 if it has any other value. If the resulting raster has both 0 and 1 values, the input raster has NoData cells. The Count field in the output raster’s attribute table tells you how many cells of each value there are.
In the example below, the input raster on the left shows NoData cells in white. This raster has 9 rows and 10 columns, totaling 90 cells. There are four unique values (3, 9, 15, and 22). Summing the counts of each value (11, 22, 25, and 10) gives a total of 68 cells. Subtracting this from 90 shows there are 22 NoData cells. The output from the Is Null tool on the right shows the 1 values representing NoData, matching the expected count of 22 cells.
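To make the logic concrete, here is a minimal plain-Python sketch of what Is Null reports. It is not arcpy: the grid is a list of lists, None stands in for NoData, and the function names and the small sample raster are made up for illustration.

```python
# None stands in for NoData in this plain-Python sketch (an assumption, not arcpy).
def is_null(grid):
    """Return a grid with 1 where the input cell is NoData (None) and 0 elsewhere."""
    return [[1 if cell is None else 0 for cell in row] for row in grid]


def count_values(grid):
    """Count the cells per value, similar to the Count field of an attribute table."""
    counts = {}
    for row in grid:
        for cell in row:
            counts[cell] = counts.get(cell, 0) + 1
    return counts


# A small hypothetical raster with 3 rows, 4 columns, and three NoData cells.
sample = [
    [3, 3, None, 9],
    [3, None, 9, 9],
    [15, 15, None, 22],
]
null_mask = is_null(sample)
mask_counts = count_values(null_mask)  # cells per value in the Is Null output

# The same check as in the text: total cells minus the cells that hold a value.
total_cells = len(sample) * len(sample[0])
valid_cells = sum(n for value, n in count_values(sample).items() if value is not None)
nodata_cells = total_cells - valid_cells
```

The count of 1 values in the mask and the subtraction should agree, just as the 22 cells do in the example above.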
Factors to Consider in Analysis
Before diving into any analysis, it’s important to understand the problem you’re trying to solve. To ensure you follow the right analytical path and get the correct answer, carefully consider the types of analysis you will perform.
When it comes to filling NoData areas, here are some factors to think about:
- Where are the replacement values coming from?
- Is the raster discrete or continuous? Examples of discrete raster data include land use classes or ranks. Examples of continuous rasters include elevation and concentration.
- What is the nature of the NoData area to fill? Is it just a few scattered cells, or large areas? Some solutions work better on small areas, while others handle larger areas.
What to Replace NoData Cells With
To choose the right solution, first figure out where the replacement values are coming from. Common sources include:
- A specific numerical value
- Cell values from another raster
- The nearest cell
- A statistic of the surrounding values, like the average or the largest
Once you know the source of the replacement values, follow a specific workflow to get the result you want.
Workflows for Replacing NoData
The graphic below shows four general workflows for replacing NoData areas. The left column lists where the replacement values will come from. The center column shows the basic workflow for each scenario. The right column provides some information on how well the scenarios work, considering the size of the NoData area and the type of data.
A: Replace NoData Cells with a Specific Value
The simplest way to fill NoData cells is by replacing them with a specific value. By using the same value for all instances, you can apply it to all NoData cells without worrying about their size or distribution. This method works best for discrete data.
As mentioned earlier, you can use the Is Null tool to create a raster where a value of 1 indicates a NoData cell, and a value of 0 indicates any other cell.
The Con tool evaluates each cell of an input raster based on a logical condition. Here’s how to use it to replace NoData cells with a specific value:
- Set the raster with NoData as the Input Conditional Raster.
- In the Expression, set the Where clause to Value and select the is null option from the list.
- This uses the Is Null tool internally to identify NoData cells.
- Set the Input True Raster or Constant Value to the replacement value you want for NoData cells.
- Set the Input False Raster or Constant Value to the original input raster to keep those values in the output.
- Set the Output Raster location and name.
- Run the tool.
The illustration below shows NoData locations replaced with the new value of 2, while other locations kept their original values.
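The cell-by-cell logic behind this workflow can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the Con tool itself: None stands in for NoData, and the function name and sample values are hypothetical.

```python
def fill_with_constant(grid, fill_value):
    """Mimic the idea of Con(IsNull(grid), fill_value, grid), cell by cell.

    None stands in for NoData (an assumption of this sketch).
    """
    return [[fill_value if cell is None else cell for cell in row] for row in grid]


# Hypothetical discrete raster with two NoData cells.
land_use = [
    [3, None, 9],
    [None, 15, 9],
]
filled = fill_with_constant(land_use, 2)
print(filled)  # [[3, 2, 9], [2, 15, 9]]
```

Every NoData cell receives the same value, so the size and distribution of the gaps make no difference to the result.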
B: Replace NoData Cells with Values from a Different Raster
Instead of using a constant value, you can replace NoData cells with values from another raster. This works for both discrete and continuous data, but it’s best to match the type of the replacement raster to the one you’re updating.
Here’s how to use the Con tool to do this:
- Set the raster with NoData as the Input Conditional Raster.
- In the Expression, set the Where clause to Value and select the is null option from the list.
- Set the Input True Raster or Constant Value to the raster where the replacement values will come from.
- Set the Input False Raster or Constant Value to the original input raster to keep those values in the output.
- Set the Output Raster location and name.
- Run the tool.
Note: If the raster providing the replacement values has different properties (like extent, cell size, cell alignment, or coordinate system) than the raster containing the NoData cells, you’ll need to account for these differences. By default, the Con tool will use the union of the extents of the two input rasters and the maximum cell size. To preserve the existing cell values that aren’t NoData, make sure to set the Extent, Cell Size, and Snap raster environments to your original input raster.
The illustration below shows how values from the other raster replaced the NoData locations, while other locations retained their original values.
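As a rough plain-Python sketch of the same idea, the function below takes the replacement from the matching cell of a second grid. It assumes both grids already share the same rows, columns, and alignment, which is the situation the environment settings in the note are meant to guarantee. None stands in for NoData, and all names and values are hypothetical.

```python
def fill_from_raster(grid, source):
    """Mimic the idea of Con(IsNull(grid), source, grid), cell by cell.

    None stands in for NoData. Both grids must have the same shape; this
    sketch does no resampling or alignment.
    """
    if len(grid) != len(source) or any(len(a) != len(b) for a, b in zip(grid, source)):
        raise ValueError("both grids must have the same rows and columns")
    return [
        [src if cell is None else cell for cell, src in zip(row, src_row)]
        for row, src_row in zip(grid, source)
    ]


# Hypothetical continuous raster and a second raster covering the same cells.
elevation = [
    [10.0, None, 12.5],
    [None, 11.0, None],
]
backup = [
    [9.0, 10.5, 12.0],
    [10.0, 11.5, None],
]
patched = fill_from_raster(elevation, backup)
print(patched)  # [[10.0, 10.5, 12.5], [10.0, 11.0, None]]
```

Note that the last cell stays NoData because the replacement raster has no value there either.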
C: Replace NoData with the Value of the Nearest Spatial Neighbor
You might want to replace NoData cells with the value of the nearest cell. The Nibble tool can do this by replacing the input values for a defined area with the nearest value outside that area. This method is best for discrete data.
The Nibble tool needs two inputs: the raster where values at selected locations will be replaced with the nearest value, and a mask that identifies those locations. For this mask, NoData cells are within the mask, and any other value cells are outside the mask.
Here’s how to use the Nibble tool:
- Set the raster containing NoData as the Input Raster.
- Set the same raster as the Input Raster Mask.
- Set the Output Raster location and name.
- Uncheck the Use NoData Values If They Are the Nearest Neighbor parameter.
- This ensures only valid cells are considered for replacing NoData cells.
- Check the Nibble NoData Cells parameter.
- This will make the tool replace the NoData cells inside the masked area with the value of the nearest neighbor outside the masked area.
- Leave the Input Zone Raster parameter blank.
- Run the tool.
The illustration below shows how the value of the nearest input cell replaces the NoData locations. To make comparison easier, a dark red outline on the output raster identifies the NoData locations of the input raster. In case of ties, the output will be the lowest of the tied values. Small arrows identify the specific input cell providing the replacement value.
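The nearest-neighbor idea, including the tie rule, can be illustrated with a brute-force plain-Python sketch. It is not the Nibble algorithm: it assumes straight-line distance between cell centers, uses None for NoData, and compares every NoData cell against every valid cell, which is only practical for tiny grids. The function name and sample values are hypothetical.

```python
def fill_with_nearest(grid):
    """Replace each NoData (None) cell with the value of the nearest valid cell.

    Distance is the straight-line distance between cell centers, in cells.
    Ties go to the lowest of the tied values, as described in the text.
    """
    valid = [
        (r, c, value)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value is not None
    ]
    result = [row[:] for row in grid]
    if not valid:
        return result  # nothing to copy from
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value is None:
                # Compare by squared distance first, then by the value itself,
                # so equal distances resolve to the lowest value.
                _, nearest_value = min(
                    ((r - vr) ** 2 + (c - vc) ** 2, v) for vr, vc, v in valid
                )
                result[r][c] = nearest_value
    return result


# Hypothetical discrete raster with three NoData cells.
classes = [
    [3, None, 9],
    [3, None, None],
    [15, 15, 22],
]
nibbled = fill_with_nearest(classes)
print(nibbled)  # [[3, 3, 9], [3, 3, 9], [15, 15, 22]]
```

The center cell is equally close to a 3 and a 15, and the cell to its right is equally close to a 9 and a 22, so both take the lower value.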
D: Replace NoData with a Statistic Calculated from the Surrounding Cells
A statistic calculated from the surrounding cells can replace a NoData cell.
This can be done by incorporating the neighborhood tool Focal Statistics into the analysis. For each input cell location, this tool calculates a statistic of the values within a specified neighborhood around it. You can specify a variety of neighborhood shapes, such as a rectangle, a circle, or a pie-shaped wedge, in whatever size you need. There are a variety of statistics to calculate, such as the average or minimum value.
The size of the NoData areas is a consideration for this tool. For individual cells or small groups of NoData cells, the default 3 by 3 cell neighborhood is enough to calculate the replacement value. To fill larger areas of NoData, either expand the size of the neighborhood or run the process several times. For discrete data, the most appropriate statistics for this method are the maximum, minimum, most common, and least common. For continuous data, the mean statistic is typically the best one to use.
If run by itself, the Focal Statistics tool will calculate a statistic value for every cell in the input raster. To perform the calculations only on the NoData cells, we will apply the technique of using the Is Null tool within the Con tool to identify those locations. Then we will use the Focal Statistics tool to calculate a new value for those locations only.
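Before turning to the map algebra, the logic of a single pass can be sketched in plain Python. This is an illustration under stated assumptions rather than the Focal Statistics tool: None stands in for NoData, the neighborhood is a square window clipped at the raster edge, NoData cells inside the window are ignored, and the function name and sample grid are hypothetical.

```python
def focal_fill(grid, statistic=max, radius=1):
    """One pass of the idea Con(IsNull(grid), FocalStatistics(grid, ...), grid).

    Only NoData (None) cells are recalculated. radius=1 gives a 3 by 3 window.
    A cell stays NoData when its whole window holds no valid value.
    """
    rows, cols = len(grid), len(grid[0])
    result = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] is not None:
                continue  # keep existing values untouched
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
                if grid[rr][cc] is not None
            ]
            if window:
                result[r][c] = statistic(window)
    return result


# Hypothetical raster with a 3 by 3 block of NoData in the middle.
surface = [
    [1, 2, 3, 4, 5],
    [2, None, None, None, 6],
    [3, None, None, None, 7],
    [4, None, None, None, 8],
    [5, 6, 7, 8, 9],
]
pass1 = focal_fill(surface)  # the center cell is still NoData after one pass
pass2 = focal_fill(pass1)    # a second pass fills the remaining cell
```

Because the hole is as large as the window, the center cell has no valid neighbors on the first pass and is only filled on the second. For continuous data, a mean function such as statistics.mean could be passed as the statistic instead of max.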
To embed this tool in Con, it is necessary to create a complex expression in map algebra. In ArcGIS Pro, this can be done in the Raster Calculator tool or in the Python Window.
Apply the following workflow to run the Focal Statistics tool in the Python window:
- Open the Python window and import the necessary modules.
- Set the workspace to where your data is located.
- Begin to enter the map algebra expression to construct the statement for the Con tool.
- For the Input conditional raster parameter, enter “IsNull()” and specify the name of your input raster.
- For the Input true or constant value parameter, specify the necessary syntax for the Focal Statistics tool to calculate the output for the neighborhood and statistic of choice.
- Enter the name FocalStatistics, without a space.
- Set the Input raster to the raster you are processing.
- Set the Neighborhood parameter to the shape and size of the neighborhood around the NoData cells you want to calculate the statistic for.
- Set the Statistics type parameter to the one you want to calculate.
- For the Input false or constant value, specify the original input raster again.
- Run the expression.
- As needed, set up and run the expression again to fill in large NoData areas.
- Since the result of the map algebra expression is a temporary raster object, use the Raster save method to persist the final output raster.
The following graphic shows an example of the syntax used to create a rectangular 3 by 3 cell focal neighborhood:
The following illustration shows how the NoData locations of the input raster were replaced with the maximum value in a 3 by 3 cell neighborhood around them. In this case, the NoData area is larger than the focal processing window, so some NoData locations remain NoData after the first focal operation because their neighborhoods contain no input values to calculate from. To resolve this, you can either run the operation again to replace the remaining NoData cells or use a larger neighborhood. This example ran the focal operation two times, with the output from the first pass used as input to the second.
The following is the syntax used in the Python window to create the final output raster for this example:
import arcpy
from arcpy import env
from arcpy.sa import *  # Con, IsNull, FocalStatistics, NbrRectangle
arcpy.CheckOutExtension("Spatial")  # these tools need a Spatial Analyst license
env.workspace = r"C:\project1"
# Pass 1: fill NoData cells with the maximum of their 3 by 3 cell neighborhood
OutFS1 = Con(IsNull('inRas.tif'), FocalStatistics('inRas.tif', NbrRectangle(3, 3, 'CELL'), 'MAXIMUM'), 'inRas.tif')
# Pass 2: repeat on the first result to fill the cells that are still NoData
OutFS2 = Con(IsNull(OutFS1), FocalStatistics(OutFS1, NbrRectangle(3, 3, 'CELL'), 'MAXIMUM'), OutFS1)
OutFS2.save(r"C:\Project1\outFmax.tif")
Summary
In this article, we touched on what NoData is, how to identify it, and some things to consider about the nature of your input. We then went over several solutions for replacing NoData cells in a raster using the functionality available in ArcGIS Spatial Analyst.
There are other scenarios that can have a similar objective. One example is to use an interpolation tool to replace NoData cells in an elevation raster, with the goal of maintaining the local trends in the surrounding surface cells. Another is to first classify the nature of the NoData areas and then apply different workflows to each category sequentially to get a final product.
You can take the logic behind the workflows shown here and apply them to many applications in your own work.