A current project with the EPA (Environmental Protection Agency) is investigating the use of space-time variograms to detect vegetation change over a ten-year period, using data from the Oregon Pilot Study area. The project uses satellite-derived NDVI (Normalized Difference Vegetation Index) data. The NDVI is widely used in a variety of biospheric and hydrologic studies; for instance, it plays an important role in the soil moisture mapping conducted in SGP 99. This project will advance the science of ecological monitoring and demonstrate techniques for regional-scale assessment of the condition of aquatic resources in the western United States.
The variogram is used (especially in geostatistics) to quantify spatial correlation, i.e., similarity or dissimilarity in a statistical sense as a function of separation distance and direction. Variograms are analogous to the (auto)covariance or (auto)correlation function, except that they exist under weaker conditions. For a function defined on n-dimensional Euclidean space to be a valid variogram, it must satisfy certain conditions: for example, its growth rate must be less than quadratic, and it must be conditionally negative definite. The second condition is not easy to check for a particular function, so the usual practice is to use known valid models or positive linear combinations of them (the class of valid models is closed under positive linear combinations). When extending the variogram into space-time, there are two general approaches: the first is to treat space-time as simply a higher-dimensional Euclidean space; the second is to “separate” space and time. The disadvantage of the first approach is that it requires a metric or norm on space-time, which essentially means that time as a “dimension” is no different from the other Euclidean dimensions; this contradicts some of the usual perceptions of time. The second approach is essentially the same as constructing a valid model on n-dimensional space from two models, one valid on k-dimensional space and the other on (n-k)-dimensional space (for space-time, use k = 1).
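For reference, the semivariogram of a random field Z is defined (in standard notation, not specific to this project) by

    gamma(h) = (1/2) E[ (Z(s + h) - Z(s))^2 ],

and one common way to “separate” space and time is to take a valid spatial covariance C_s on k-dimensional space and a valid temporal covariance C_t on one-dimensional time and form the product

    C(h_s, h_t) = C_s(h_s) * C_t(h_t),

which is again a valid covariance on the product space (the class of covariance functions is closed under products); the corresponding variogram is gamma(h_s, h_t) = C(0, 0) - C(h_s, h_t). This is one standard construction, not necessarily the exact model adopted in this project.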
The NDVI is the difference of the near-infrared (channel 2) and visible (channel 1) reflectance values, normalized by the sum of channels 1 and 2. It is based on the principle that actively growing green plants strongly absorb radiation in the visible region of the spectrum (the ‘PAR’, or ‘Photosynthetically Active Radiation’) while strongly reflecting radiation in the near-infrared region. The concept of vegetative ‘spectral signatures’ (patterns) is based on this principle. Given the following abbreviations:
PAR   Value of Photosynthetically Active Radiation from a pixel
NIR   Value of Near-Infrared Radiation from a pixel
The NDVI for a pixel is calculated from the following formula:

           NIR - PAR
   NDVI = -----------
           NIR + PAR
This formula yields a value that ranges from -1 (usually water) to +1 (strongest vegetative growth), where increasing positive values indicate increasing green vegetation and negative values indicate non-vegetated surface features such as water, barren land, ice, snow, or clouds. The NDVI can be derived at several points in the processing flow. To retain the most precision, the NDVI is derived after calibration of channels 1 and 2 and prior to scaling to byte range. Computation of the NDVI must also precede geometric registration and resampling to maintain precision in this calculation.
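As a minimal sketch in C, assuming the calibrated channel-1 (PAR) and channel-2 (NIR) reflectances are already available as floating-point values (the function name is ours, not from the project code):

    /* NDVI from calibrated channel-2 (NIR) and channel-1 (PAR/visible)
       reflectances; guards against a zero denominator. */
    double ndvi(double nir, double par)
    {
        double sum = nir + par;
        return (sum != 0.0) ? (nir - par) / sum : 0.0;
    }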
To scale the computed NDVI results to the byte data range, the computed NDVI value, which ranges from -1.0 to 1.0, is scaled to the range 0 to 200: a computed -1.0 maps to 0, a computed 0 maps to 100, and a computed 1.0 maps to 200. As a result, scaled NDVI values less than 100 represent clouds, snow, water, and other non-vegetative surfaces, and values greater than or equal to 100 represent vegetated surfaces.
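The mapping is linear, scaled = 100 * (NDVI + 1). A sketch (the rounding and clamping are our additions):

    /* Map NDVI in [-1.0, 1.0] onto the byte range [0, 200]:
       -1.0 -> 0, 0.0 -> 100, 1.0 -> 200. */
    unsigned char scale_ndvi(double v)
    {
        int s = (int)(100.0 * (v + 1.0) + 0.5);   /* round to nearest */
        if (s < 0)   s = 0;                       /* clamp to range   */
        if (s > 200) s = 200;
        return (unsigned char)s;
    }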
To monitor vegetation response, NDVI data can be used to determine greenness over time. NDVI is presumably determined from cloud-free AVHRR observations; the daily AVHRR values are composited to produce the biweekly cloud-free AVHRR data. The NDVI data were calculated and scaled to range from 0 to 200 to represent terrestrial features on land. Prior to the analysis, the NDVI data were inspected and some outliers were found; they were, for example, low (<105) or equal to 200. Information on climate was used to verify the data and the conditions on the day of observation. Cleaning the data is therefore a necessary step before any analysis and inference. Low NDVI values (80-105) could be the result of snow, inland water bodies, exposed soils, and dust (Eastman and Fulk, 1993; Myneni et al., 1997). Values greater than 200 were probably defaults and were therefore excluded from the data. The differences between two consecutive days were also examined. In a study separating dust-storm and cloud effects on NDVI used for drought assessment in Burkina Faso, Groten (1993) indicated that if an NDVI value is less than that of the preceding day by more than 10%, this is a dust-storm effect, and he used an algorithm to substitute for the value recorded on a dust-storm day. We used his algorithm and substituted for low NDVI values when the differences between consecutive NDVI values were ≥ 20, as follows:
Time   NDVI   NDVI used   Calculation
  1    163    163
  2    139    163.667     163 + (165 - 163)/3 = 163.667
  3    142    164.333     (163.667 + 165)/2 = 164.333
  4    165    165
  5    158    158
  6    165    165
  7    141    163.5       (163 + 164)/2 = 163.5
  8    164    164
  9    150    165
The cleaned NDVI data were within the acceptable range (105 < NDVI < 200), and the absolute value of the difference between any two consecutive values was less than 20.
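A minimal sketch in C of a substitution rule in this spirit (this is our reading of the procedure, not the project's FORTRAN code; LOW_CUTOFF, BAD_DROP, and clean_series are hypothetical names): a value is treated as suspect if it is below 105 or drops by 20 or more from the last good value, and a run of suspect values is replaced by linear interpolation between the nearest good neighbours. Applied to the series above, this reproduces rows 2 and 3 exactly; rows 7 and 9 of the worked example use slightly different neighbour values.

    #define LOW_CUTOFF 105.0   /* values below this are suspect          */
    #define BAD_DROP   20.0    /* a drop this large from the last good
                                  value also marks a suspect observation */

    /* Replace suspect NDVI values by linear interpolation between the
       nearest good neighbours (substitution step only). */
    void clean_series(double v[], int n)
    {
        for (int i = 1; i < n; i++) {
            if (v[i] < LOW_CUTOFF || v[i - 1] - v[i] >= BAD_DROP) {
                int j = i + 1;
                /* scan forward for the next plausible value */
                while (j < n && (v[j] < LOW_CUTOFF ||
                                 v[i - 1] - v[j] >= BAD_DROP))
                    j++;
                if (j == n)
                    break;                     /* no good value left  */
                double step = (v[j] - v[i - 1]) / (j - i + 1);
                for (int k = i; k < j; k++)    /* interpolate the gap */
                    v[k] = v[i - 1] + step * (k - i + 1);
                i = j;                         /* resume after the gap */
            }
        }
    }

For example, cleaning the series {163, 139, 142, 165, ...} replaces the 139 with 163.667 and the 142 with 164.333, as in the table.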
Before cleaning, the NDVI data file looks like the following:
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999
-9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 153 160 155 158 158 158 155 149 147 147 144 144 152 156
153 152 146 139 148 151 147 135 130 140 141 137 136 137 139 148 152 150 146 138
144 147 147 147 147 144 147 144 142 150 150 147 147 151 149 135 140 137 137 155
155 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
The value -9999 is the default reading, i.e., a fill value for missing data.
For cleaning the NDVI data, a program written in FORTRAN was used. The FORTRAN program was tested for accuracy, reliability, and efficiency. The program was also translated into C in the expectation that it would be more efficient and less time consuming; after running both on the supercomputer, however, the FORTRAN code proved more efficient. Because of the huge volume of data, cleaning takes a long time on an average computer, sometimes months, so a supercomputer is a practical necessity for handling a data file of this size. We therefore ran the program on the supercomputer for one year of data at a time; each run took about two days to complete. After cleaning, a mean value was assigned to each pixel. The data are arranged in the following way:
NDVI yearly mean (year 1989)
Xcoord Ycoord MeanValue
-1870500. 36500.00 -9999.000
-1870500. 37500.00 -9999.000
-1870500. 38500.00 -9999.000
-1870500. 39500.00 -9999.000
-1870500. 40500.00 -9999.000
-1870500. 41500.00 -9999.000
-1870500. 42500.00 -9999.000
-1870500. 43500.00 -9999.000
-1870500. 44500.00 -9999.000
-1870500. 45500.00 -9999.000
-1870500. 46500.00 -9999.000
-1870500. 47500.00 -9999.000
-1870500. 48500.00 -9999.000
-1870500. 49500.00 -9999.000
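A minimal sketch of the per-pixel averaging step, treating -9999 as the missing-data fill value (the function name and data layout are our assumptions, based on the sample above):

    /* Yearly mean of one pixel's NDVI series; -9999 entries are
       treated as missing.  Returns -9999 if every entry is missing. */
    double pixel_mean(const double v[], int n)
    {
        double sum = 0.0;
        int count = 0;
        for (int i = 0; i < n; i++) {
            if (v[i] != -9999.0) {    /* skip fill values */
                sum += v[i];
                count++;
            }
        }
        return (count > 0) ? sum / count : -9999.0;
    }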
The next thing we are going to do is merge all the folders of one year's data into one large folder; we are currently testing the program that will do this job. After this we will begin the statistical analysis and use the space-time variogram as a decision-making tool to reach a conclusion about the vegetation change.
So far we have dealt with the information: we have organized and reorganized the data files to refine them, and the data files are now ready for computation and analysis. Each data record is associated with four variables, easting, northing, time, and pixel value, and the data are arranged under these four columns in the data files. The next step is to apply the statistical tool to find the average difference between pairs of points in space, holding the time variable constant. A vector having distance and direction indicates the separation between two points; each distance vector is assigned to one of the angle classes 0-45, 45-90, or 90-135 degrees to capture direction. A computer program will do the job of comparing pairs of points: it takes a distance and direction and searches all the rows of the data file in order to measure similarity and dissimilarity between pairs of points. We are interested in the nature of the computed differences among those points: if two points are close together, a smaller difference is anticipated; if the points are far apart, the calculated difference would be greater.
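A minimal sketch of this pairwise computation for one time slice, in C (the bin width, array sizes, and names are our assumptions; the project's actual program may differ): it accumulates half the average squared difference of pixel values into distance and angle-class bins, which is the empirical semivariogram.

    #include <math.h>

    #define NLAG     20
    #define LAGWIDTH 1000.0   /* metres per distance bin (assumed)      */
    #define NDIR     3        /* angle classes: 0-45, 45-90, 90-135 deg */

    /* Empirical directional semivariogram for one time slice.
       x, y are easting/northing, z the pixel values; gamma and npairs
       must be zero-initialised by the caller.  Fill values (-9999) and
       pairs falling outside the bins are skipped. */
    void semivariogram(const double x[], const double y[], const double z[],
                       int n, double gamma[NDIR][NLAG],
                       long npairs[NDIR][NLAG])
    {
        for (int i = 0; i < n; i++) {
            if (z[i] == -9999.0) continue;
            for (int j = i + 1; j < n; j++) {
                if (z[j] == -9999.0) continue;
                double dx = x[j] - x[i], dy = y[j] - y[i];
                double h = sqrt(dx * dx + dy * dy);
                double ang = atan2(dy, dx) * 180.0 / 3.14159265358979;
                if (ang < 0.0) ang += 180.0;     /* fold into [0, 180) */
                int dir = (int)(ang / 45.0);     /* 0:0-45 1:45-90 2:90-135 */
                int lag = (int)(h / LAGWIDTH);
                if (dir < NDIR && lag < NLAG) {
                    double d = z[j] - z[i];
                    gamma[dir][lag] += 0.5 * d * d;  /* semivariance */
                    npairs[dir][lag]++;
                }
            }
        }
        for (int dir = 0; dir < NDIR; dir++)     /* average each bin */
            for (int lag = 0; lag < NLAG; lag++)
                if (npairs[dir][lag] > 0)
                    gamma[dir][lag] /= npairs[dir][lag];
    }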
A similar statistical computation will be done with respect to time, keeping the space variables constant. We will calculate the average difference between values at the same location as time changes, and the difference will be plotted on a graph against the change in year. The shape of the graph will reveal whether there is any change in the data over time and, in turn, give us some understanding of the change in vegetation over time.
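Under the same assumptions, the temporal counterpart at a single fixed location is simpler; here z[t] is the yearly mean for year t:

    /* Empirical temporal semivariogram at one fixed pixel;
       gamma[u] is filled for year lags u = 1..maxlag. */
    void temporal_semivariogram(const double z[], int n,
                                double gamma[], long npairs[], int maxlag)
    {
        for (int u = 1; u <= maxlag; u++) {
            gamma[u] = 0.0;
            npairs[u] = 0;
            for (int t = 0; t + u < n; t++) {
                if (z[t] == -9999.0 || z[t + u] == -9999.0)
                    continue;                   /* skip missing years */
                double d = z[t + u] - z[t];
                gamma[u] += 0.5 * d * d;
                npairs[u]++;
            }
            if (npairs[u] > 0)
                gamma[u] /= npairs[u];          /* average over pairs */
        }
    }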
We have ten layers of data, i.e., data for ten years, so these computations will be done either for each layer separately or for all the layers at once.