Let’s begin by loading the package.
# Load the package, and set a global random-number seed for the reproducible generation of fake data later on. library(pavo) set.seed(1612217)
The raw spectral data used in this example are available from the package repository on github, located here. You can download and extract it to follow the vignette exactly. Alternatively, the data are included as an RData file as part of the package installation, and so can be loaded directly (see below).
The data consist of reflectance spectra, obtained using Avantes equipment and software, from seven bird species: Northern Cardinal Cardinalis cardinalis, Wattled Jacana Jacana jacana, Baltimore Oriole Icterus galbula, Peach-fronted Parakeet Aratinga aurea, American Robin Turdus migratorius, and Sayaca Tanager Thraupis sayaca. Several individuals were measured (sample size varies by species), and 3 spectra were collected from each individual. However, the number of individuals measured per species is uneven and the data have additional peculiarities that should emphasize the flexibility
pavo offers, as we’ll see below.
pavo includes three datasets that can be called with the
data(flowers) will all be used in this vignette. See the help files for each dataset for more information; via
The first thing we need to do is import spectral data into R using the function
getspec(). It’s worth noting that
getspec() is simply a wrapper for
lr_get_spec() from the package
lightr, which is a more specialised and feature-rich package for the import of spectral data and metadata (Gruson, White, and Maia 2019). Since the example spectra were obtained using Avantes software, we will need to specify that the files have the
.ttt extension. Further, the data is organized in subdirectories for each species.
getspec() will search through subdirectories recursively, and may include the names of the subdirectories in the spectra name if desired.
getspec() also uses parallel processing thanks to the
future package. You can check the documentation (
?future::plan()) for more details but an easy way to set up parallel processing in most cases is to use the
plan("multiprocess") command. A final issue with the data is that it was collected using a computer with international numbering input, which means it uses commas instead of periods as a decimal separator. We can specify that in the function call.
If, for example, the raw spectral files were downloaded and placed in a directory called
/pavo/data_external/vignette, you might execute the following command and see this output:
specs <- getspec("~/pavo/data_external/vignette", ext = "ttt", decimal = ",", subdir = TRUE, subdir.names = FALSE) # 213 files found; importing spectra # |================================================================================| 100%, ETA 00:00
For convenience, however, we’ve included the spectra as an RData file in the package installation, and so will simply load it directly.
And we can inspect the resulting object:
specs[1:10, 1:4] #> wl cardinal.0001 cardinal.0002 cardinal.0003 #> 1 300 5.7453 8.0612 8.0723 #> 2 301 6.0181 8.3926 8.8669 #> 3 302 5.9820 8.8280 9.0680 #> 4 303 6.2916 8.7621 8.7877 #> 5 304 6.6277 8.6819 9.3450 #> 6 305 6.3347 9.6016 9.4834 #> 7 306 6.3189 9.5712 9.3533 #> 8 307 6.7951 9.4650 9.9492 #> 9 308 7.0758 9.4677 9.8587 #> 10 309 7.2126 10.6172 10.5396 dim(specs) # the data set has 213 spectra, from 300 to 700 nm, plus a 'wl' column #>  401 214
pavo imports spectra, it creates an object of class
rspec, which inherits attributes from the
If you already have multiple spectra in a single data frame that you’d like to use with
pavo functions, you can use the command
as.rspec() to convert it to an rspec object. The function will attempt to identify the wavelength variable or you can specify the column containing wavelengths with the
whichwl argument. The default way that
as.rspec() handles reflectance data is to interpolate the data in 1-nm bins, as is commonly done for spectral analyses. However, this can be turned off by using:
interp = FALSE. As an example, we will create some fake reflectance data, name the column containing wavelengths (in 0.5-nm bins) wavelength rather than wl (required for
pavo functions to work) and also put the column containing wavelengths third rather than first.
# Create some fake reflectance data with wavelength column arbitrarily titled # and not first in the data frame: fakedat <- data.frame( refl1 = rnorm(n = 801), refl2 = rnorm(n = 801), wavelength = seq(300, 700, by = .5) ) head(fakedat) #> refl1 refl2 wavelength #> 1 -0.032893386 0.5059612 300.0 #> 2 -0.478552738 -1.1526035 300.5 #> 3 -0.190687886 -1.0708952 301.0 #> 4 -0.008977959 -1.9871907 301.5 #> 5 -0.443133039 -0.3910143 302.0 #> 6 0.032110206 -0.5403221 302.5 is.rspec(fakedat) #>  FALSE fakedat.new <- as.rspec(fakedat) #> wavelengths found in column 3 #> The spectral data contain 443 negative value(s), which may produce unexpected results if used in models. Consider using procspec() to correct them. is.rspec(fakedat.new) #>  TRUE head(fakedat.new) #> wl refl1 refl2 #> 1 300 -0.03289339 0.50596119 #> 2 301 -0.19068789 -1.07089518 #> 3 302 -0.44313304 -0.39101435 #> 4 303 -0.52064707 0.84403232 #> 5 304 1.42813472 0.07150436 #> 6 305 -0.19044804 -0.84504226
As can be seen,
as.rspec() renames the column containing wavelengths, sets it as the first column, interpolates the data in 1-nm bins and converts the data to an
rspec object. Note that the same output is returned with specifying
whichwl = 3:
head(as.rspec(fakedat, whichwl = 'wavelength')) #> The spectral data contain 443 negative value(s), which may produce unexpected results if used in models. Consider using procspec() to correct them. #> wl refl1 refl2 #> 1 300 -0.03289339 0.50596119 #> 2 301 -0.19068789 -1.07089518 #> 3 302 -0.44313304 -0.39101435 #> 4 303 -0.52064707 0.84403232 #> 5 304 1.42813472 0.07150436 #> 6 305 -0.19044804 -0.84504226
lim argument allows you to specify the range of wavelengths contained in the input dataset. This is useful either in the case that the dataset doesn’t contain this information (and hence you cannot specify the column with
whichwl or automatically find the column with
as.rspec()). Additionally, it may be useful to focus on a subset of wavelengths. In our example, the wavelengths ranged from 300 to 700 nm, however you could also specify a restricted range of wavelengths with
fakedat.new2 <- as.rspec(fakedat, lim = c(300, 500)) #> wavelengths found in column 3 #> The spectral data contain 231 negative value(s), which may produce unexpected results if used in models. Consider using procspec() to correct them. plot(refl1 ~ wl, type = "l", data = fakedat.new2)
We want to stress that it is important to check the actual wavelengths contained in the data before setting this argument (
as.rspec() will warn you when wavelengths in the data are not present in the range specified with
as.rspec() will assume that wavelengths exist when in fact they may not. For example, if we set
lim = c(300, 1000) and plot the results, the reflectance values between 700 and 1000 nm are set to be equal since there is no information at these wavelengths in the original dataset:
fakedat.new2 <- as.rspec(fakedat, lim = c(300, 1000)) #> wavelengths found in column 3 #> Warning in as.rspec(fakedat, lim = c(300, 1000)): Interpolating beyond the range #> of actual data. Check 'lim' and `exceed.range` arguments to confirm this is the #> desired behaviour. #> The spectral data contain 743 negative value(s), which may produce unexpected results if used in models. Consider using procspec() to correct them. plot(fakedat.new2[, 2] ~ fakedat.new2[, 1], type = "l")
rspec object has been created, either by importing raw spectral data or converting a dataset with the
as.rspec() function, you can subset the spectra based on their names using a modified version of
subset function. For example, the following code illustrates how to create an
rspec object containing only tanagers:
specs.tanager1 <- subset(specs, "tanager") head(specs.tanager1)[1:5] #> wl tanager.0001 tanager.0002 tanager.0003 tanager.0004 #> 1 300 10.0618 10.6744 10.1499 13.7473 #> 2 301 11.1472 10.8054 9.8003 14.3102 #> 3 302 10.7819 10.6134 9.5607 14.4463 #> 4 303 11.0210 11.2037 10.4107 15.5533 #> 5 304 10.2177 11.2120 9.9452 14.3841 #> 6 305 11.5664 11.6135 10.8659 15.6445
subset function here is using partial matching to find all spectra with the string “tanager” in their name. To fully benefit from this flexible subsetting functionality, it is important that you follow a consistent file naming scheme. For example,
tanager.14423.belly.001.ttt would indicate the species (tanager), individual ID (14423), body patch (belly) and measurement number (001). Additionally, we suggest that labels used should have the same number of characters, which simplifies character string manipulation and subsetting based on partial matching.
If you prefer not to use partial matching,
subset will also work if you provide a logical condition, similar to the default
subset behaviour in
R. For example:
# extract first component of filenames containing species names spp <- do.call(rbind, strsplit(names(specs), "\\."))[, 1] # subset specs.tanager2 <- subset(specs, subset = spp == "tanager") # compare subsetting methods all.equal(specs.tanager1, specs.tanager2) #>  TRUE
subset will also work with visual model (class
vismodel) and colspace (class
colspace) objects, as described below.
Another useful function is
merge. Let’s say that you have subsetted spectra for tanagers and parakeets, and you would like to re-combine them for an analysis. The following lines of code show how to do this:
specs.tanager <- subset(specs, "tanager") specs.parakeet <- subset(specs, "parakeet") specs.new <- merge(specs.tanager, specs.parakeet)
Note that this re-combined file (
specs.new) has only a single
wl column with the merged spectra as columns. Keep in mind that the order of objects in
merge will determine the order of columns in the final merged object (i.e. tanagers will be before parakeets).
As previously described, our data (contained in the
specs object) consists of multiple individuals, and each was measured three times, as is common to avoid measurement bias. A good way to visualize the repeatability of our measurements is to plot the spectra of each individual separately. The function
explorespec() provides an easy way of doing so. You may specify the number of spectra to be plotted in the same panel using the argument
specreps, and the function will adjust the number of panels per page accordingly. We will exemplify this function using only the 12 cardinal individuals measured:
So our first step would be to take the average of each of these three measurements to obtain average individual spectra to be used in further analyses. This is easily accomplished using the
aggspec() function. The
by argument can be either a number (specifying how many specs should be averaged for each new sample) or a vector specifying the identities of the spectra to be combined (see below):
mspecs <- aggspec(specs, by = 3, FUN = mean) mspecs[1:5, 1:4] #> wl cardinal cardinal.1 cardinal.2 #> 1 300 7.292933 5.676700 6.387233 #> 2 301 7.759200 5.806700 6.698200 #> 3 302 7.959333 5.858467 6.910500 #> 4 303 7.947133 6.130267 7.357567 #> 5 304 8.218200 6.127933 7.195267 dim(mspecs) # data now has 71 spectra, one for each individual, and the 'wl' column #>  401 72
Now we’ll use the
aggspec() function again, but this time to take the average spectrum for each species. However, each species has a different number of samples, so we can’t use the
by argument as before. Instead we will use regular expressions to create a species name vector by removing the numbers that identify individual spectra:
# create a vector with species identity names spp <- gsub('\\.[0-9].*$', '', names(mspecs))[-1] table(spp) #> spp #> cardinal jacana oriole parakeet robin tanager #> 12 9 9 13 10 18
Instead, we are going to use the
spp vector we created to tell the
aggspec() function how to average the spectra in
sppspec <- aggspec(mspecs, by = spp, FUN = mean) round(sppspec[1:5, ], 2) #> wl cardinal jacana oriole parakeet robin tanager #> 1 300 7.05 7.33 3.89 7.63 3.98 9.02 #> 2 301 7.25 7.35 3.91 7.75 3.91 9.53 #> 3 302 7.44 7.45 4.13 7.89 4.19 9.41 #> 4 303 7.82 8.09 4.39 8.49 4.51 10.20 #> 5 304 7.84 7.71 4.18 8.66 4.07 9.68
Data obtained from spectrometers often requires further processing before analysis and/or publication. For example, electrical noise can produce unwanted “spikes” in reflectance curves. The
procspec() can handle a variety of processing techniques. For example, the reflectance curve from the parakeet is noisy in the short (300-400 nm) and long (650-700 nm) wavelength ranges (see Figure below, black line). To eliminate this noise, we will use local regression smoothing implemented by the
loess.smooth() function in
R, wrapped in the
opt="smooth" argument of
But first, let’s use the
plotsmooth() function to determine a suitable smoothing parameter (
span). This function allows you to set a minimum and maximum smoothing parameter to try and plots the resulting curves against the unsmoothed (raw) data in a convenient multipanel figure.
From the resulting plot, we can see that
span = 0.2 is the minimum amount of smoothing to remove spectral noise while preserving the original spectral shape. Based on this value, we will now use the
opt argument in
procspec() to smooth data for further plotting and analysis.
spec.sm <- procspec(sppspec, opt = "smooth", span = 0.2) #> processing options applied: #> smoothing spectra with a span of 0.2 plot(sppspec[, 5] ~ sppspec[, 1], type = "l", lwd = 10, col = "grey", xlab = "Wavelength (nm)", ylab = "Reflectance (%)" ) lines(spec.sm[, 5] ~ sppspec[, 1], col = "red", lwd = 2)
We can also try different normalisations. Options include subtracting the minimum reflectance of a spectrum at all wavelengths (effectively making the minimum reflectance equal to zero,
opt = "min", left panel, below) and making the reflectance at all wavelength proportional to the maximum reflectance (i.e. setting maximum reflectance to 1;
opt = "max", centre panel, below). Note that the user can specify multiple processing options that will be applied sequentially to the spectral data by
procspec() (right panel, below).
# Run some different normalisations specs.max <- procspec(sppspec, opt = "max") #> processing options applied: #> Scaling spectra to a maximum value of 1 specs.min <- procspec(sppspec, opt = "min") #> processing options applied: #> Scaling spectra to a minimum value of zero specs.str <- procspec(sppspec, opt = c("min", "max")) # multiple options #> processing options applied: #> Scaling spectra to a minimum value of zero #> Scaling spectra to a maximum value of 1
# Plot results par(mfrow = c(1, 3), mar = c(2, 2, 2, 2), oma = c(3, 3, 0, 0)) plot(specs.min[, 5] ~ c(300:700), xlab = "", ylab = "", type = "l") abline(h = 0, lty = 2) plot(specs.max[, 5] ~ c(300:700), ylim = c(0, 1), xlab = "", ylab = "", type = "l") abline(h = c(0, 1), lty = 2) plot(specs.str[, 5] ~ c(300:700), type = "l", xlab = "", ylab = "") abline(h = c(0, 1), lty = 2) mtext("Wavelength (nm)", side = 1, outer = TRUE, line = 1) mtext("Normalised reflectance (%)", side = 2, outer = TRUE, line = 1)
Another intended usage of
procspec() is preparation of spectral data for dimensionality reduction (for example, using Principal Component Analysis, or PCA). Following Cuthill et al. (1999), we can use
opt = 'center' to centre spectra to have a mean reflectance of zero (thus removing brightness as a dominant variable in the PCA) and then bin the spectra into user-defined bins (using the
opt = 'bin' argument) to obtain a dataframe suitable for the PCA.
# PCA analysis spec.bin <- procspec(sppspec, opt = c("bin", "center")) #> processing options applied: #> Centering spectra to a mean of zero #> binned spectra to 21-nm intervals head(spec.bin) spec.bin <- t(spec.bin) # transpose so wavelength are variables for the PCA colnames(spec.bin) <- spec.bin[1, ] # names variables as wavelength bins spec.bin <- spec.bin[-1, ] # remove 'wl' column pca1 <- prcomp(spec.bin, scale. = TRUE)
summary(pca1) #> Importance of components: #> PC1 PC2 PC3 PC4 PC5 PC6 #> Standard deviation 3.6016 1.9885 1.5791 0.6678 0.36656 4.351e-16 #> Proportion of Variance 0.6486 0.1977 0.1247 0.0223 0.00672 0.000e+00 #> Cumulative Proportion 0.6486 0.8463 0.9710 0.9933 1.00000 1.000e+00
As can be seen by the summary, PC1 explains approximately 64% of the variation in spectral shape and describes the ratio of short to long wavelengths reflected. The flexibility of
pavo’s plotting capabilities allows you to sort spectra by another variable (e.g., PC1 loading) and then plot in a stacked format using the
# Generate colours from spectra colr <- spec2rgb(sppspec) wls <- as.numeric(colnames(spec.bin)) # Rank specs by PC1 sel <- rank(pca1$x[, 1]) sel <- match(names(sort(sel)), names(sppspec)) # Plot results par(mfrow = c(1, 2), mar = c(2, 4, 2, 5), oma = c(2, 0, 0, 0)) plot(pca1$rotation[, 1] ~ wls, type = "l", ylab = "PC1 loading") abline(h = 0, lty = 2) plot(sppspec, select = sel, labels.stack = names(sppspec)[sel], type = "s", col = colr) mtext("Wavelength (nm)", side = 1, outer = TRUE, line = 1)
Negative values in spectra are unwanted, as they are uninterpretable (how can there be less than zero light reflected by a surface?) and can affect estimates of colour variables. Nonetheless, certain spectrometer manufacturers allow negative values to be saved. To handle negative values, the
procspec() function has an argument called
fixneg. The two options available are (1) adding the absolute value of the most negative value to the whole spectrum with
addmin, and (2) changing all negative values to zero with
# Create a duplicate spectrum and add some negative values refl <- sppspec[, 7] - 20 testspecs <- as.rspec(cbind(c(300:700), refl)) #> wavelengths found in column 1 #> The spectral data contain 188 negative value(s), which may produce unexpected results if used in models. Consider using procspec() to correct them. # Apply two different processing options testspecs.fix1 <- procspec(testspecs, fixneg = "addmin") #> processing options applied: #> Negative value correction: added min to all reflectance testspecs.fix2 <- procspec(testspecs, fixneg = "zero") #> processing options applied: #> Negative value correction: converted negative values to zero
# Plot it par(mar = c(2, 2, 2, 2), oma = c(3, 3, 0, 0)) layout(cbind(c(1, 1), c(2, 3)), widths = c(2, 1, 1)) plot(testspecs, select = 2, ylim = c(-10, 30)) abline(h = 0, lty = 3) plot(testspecs.fix1, select = 2, ylim = c(-10, 30)) abline(h = 0, lty = 3) plot(testspecs.fix2, select = 2, ylim = c(-10, 30)) abline(h = 0, lty = 3) mtext("Wavelength (nm)", side = 1, outer = TRUE, line = 1) mtext("Reflectance (%)", side = 2, outer = TRUE, line = 1)
These manipultions may have different effects on the final spectra, which the user should keep in mind and use according to the final goal of the analysis. For example, by adding the minimum reflectance to all other wavelength, the shape of the curve is preserved, but the maximum reflectance is much higher. On the other hand, substituting negative values with zero preserves absolute reflectance values, but may cause the spectral shape to be lost. The “best” transformation will depend on the severity of the problem of negative values and the goal of the analysis (e.g. will reflectance intensity be used? What is more important, to preserve reflectance values or the total shape of the curve?). Which correction to use would also depend on the source of the negative values. If they are thought to originate from improper calibration of the spectrophotometer,
fixneg = addmin would be appropriate. However, if they are thought to originate from electric noise,
fixneg = zero would be more appropriate.
pavo offers three main plotting functions. The main one is
plot, which combines several different options in a flexible framework for most commonly used purposes. The
explorespec() function aims at providing initial exploratory analysis, as demonstrated in Section 1. Finally,
aggplot() provides a simple framework for publication-quality plots of aggregated spectral data.
pavo uses the class
rspec to identify spectral data, the function
plot.rspec() can be called simply by calling
plot(data). If the object is not of class
rspec the multivariate visualisation methods will not work as expected, so it might be useful to check the data using
is.rspec() and convert with
as.rspec() if necessary.
We have implemented three methods of visualizing spectral data using
These options are in addition to the exploratory plotting offered by
explorespec(), as seen in the figures in section 1. To showcase the capabilities of
plot.rspec(), we will use the
teal dataset included in
pavo. This dataset consists of reflectance spectra from the iridescent wing patch of a green-winged teal (Anas carolinensis). Reflectance measurements were taken between 300 and 700 nm at different incident angles, ranging from 15° to 70° (in 5° increments) (Eliason and Shawkey 2012).
We can start out by visualizing these spectra with the
overlay option in plot. Another neat option
pavo offers is to convert reflectance spectra to their approximate perceived colour, by using the function
spec2rgb(). This can make for some very interesting plots and even exploratory data analysis, as shown above.
Another option is the
stack plot (again, with human vision approximations of the colour produced by the spectra using
teal.norm <- procspec(teal, opt = c("min", "max")) #> processing options applied: #> Scaling spectra to a minimum value of zero #> Scaling spectra to a maximum value of 1 par(mfrow = c(1, 2), mar = c(2, 2, 2, 2), oma = c(2, 2, 0, 0)) plot(teal, type = "s", col = spec2rgb(teal)) plot(teal.norm, type = "s", col = spec2rgb(teal)) mtext("Wavelength (nm)", side = 1, outer = T, line = 1) mtext("Cumulative reflectance (A.U.)", side = 2, outer = T, line = 1)
Note that in the above figure, the y axis to the right includes the index of each spectrum. This makes it easier to identify and subset specific spectra or groups of spectra using the
select argument in
plot.rspec(). Note also that the first index is actually 2, preserving the sequence in the original dataset (since the first column is wavelength). Though this may seem confusing at first (“why is my first spec number 2?”) this preserves subsetting hierarchy: using
plot(teal, select = 2) will show the same spectra that would be selected if you use
Since this dataset is three-dimensional (containing wavelengths, reflectance values and incident angles) we can also use the
heatmap function. First, it will be necessary to define a vector for the incident angles each spectrum was measured at:
Next, we will smooth the data with
procspec() and plot as a false colour map (heatmap):
# Smooth the spectral data teal.sm <- procspec(teal, opt = c("smooth")) #> processing options applied: #> smoothing spectra with a span of 0.25 # Plot it as a heatmap plot(teal.sm, type = "h", varying = angles, ylab = expression(paste("Incident angle (", degree, ")")), las = 1, useRaster = TRUE )
These plots can be very useful to observe changes over time, for example, or any other type of continuous variation.
aggplot() has a very similar interface to
aggspec(), allowing for quick plotting of aggregated spectra combined by a factor, such as species, sex, experimental treatment, and so on. Its main output is a plot with lines of group mean spectra outlined by a shaded area indicating some measure of variability, such as the standard deviation of the group. Note that functions that aren’t already implemented in R must be passed like they would be to functions such as
function(x) sd(x)/sqrt(length(x)) in the example below).
par(mfrow = c(1, 2), mar = c(4, 4, 2, 2), oma = c(2, 0, 0, 0)) # Plot using median and standard deviation, default colours aggplot(mspecs, spp, FUN.center = median, ylim = c(0, 70), alpha = 0.3, legend = TRUE) # Plot using mean and standard error, in greyscale aggplot(mspecs, spp, ylim = c(0, 70), FUN.error = function(x) sd(x) / sqrt(length(x)), lcol = 1, shadecol = "grey", alpha = 0.7)
We first import our images with
getimg(), which functions in a similar manner to
getspec(). Since we are importing more than one image, we can simply point the function to the folder containing the images, and it will use parallel processing (where possible) to import any jpg, bmp, or png images in the folder. As with the raw spectra, these images are also available at the package repository [here][data-location], but are included in the package installation for convenience.
butterflies <- getimg(system.file("testdata/images/", package = 'pavo')) #> 2 files found; importing images.
pavo imports images, it created a multidimensional object of class
rimg, which inherits from the
array class. If more than one image is imported, then these
rimg objects are stored as elements in a list. All image-based functions in
pavo are capable of working with such
rimg lists, which allows for a more convenient and tidy high-throughput workflow. Note that while
pavo will allow you to load, manipulate, and analyse as many images as your computer’s memory allows, it will throw a message if the total size of images in memory is greater than ca. 200mb, as this will result in noticeably slowed performance (everything will still work, it will just be slow). This may be ameliorated by reducing the size of individual images, such as through
procimg(), or by processing images in smaller batches. All
rimg objects have a number of identifying attributes, drawn on by image-processing functions, which can be readily inspected.
is.rimg(butterflies) #>  TRUE str(butterflies[]) #> 'rimg' num [1:500, 1:340, 1:3] 1 1 1 1 1 1 1 1 1 1 ... #> - attr(*, "state")= chr "raw" #> - attr(*, "imgname")= chr "h_melpomene" #> - attr(*, "px_scale")= logi NA #> - attr(*, "raw_scale")= logi NA #> - attr(*, "k")= logi NA #> - attr(*, "outline")= logi NA #> - attr(*, "colnames")= logi NA #> - attr(*, "tag_loc")= logi NA str(butterflies[]) #> 'rimg' num [1:500, 1:398, 1:3] 0.988 1 1 1 1 ... #> - attr(*, "state")= chr "raw" #> - attr(*, "imgname")= chr "papilio" #> - attr(*, "px_scale")= logi NA #> - attr(*, "raw_scale")= logi NA #> - attr(*, "k")= logi NA #> - attr(*, "outline")= logi NA #> - attr(*, "colnames")= logi NA #> - attr(*, "tag_loc")= logi NA
If you already have images loaded into R, you can convert them to
rimg objects using
as.rimg(). The function will attempt to interpret a multidimensional array as an RGB image with values ranging between [0, 1], and imbue it with the custom attributes of an
rimg. We can see this by creating a fake array, before converting it.
fakeimg <- array(c( matrix(c(1, 1, 0, 0), nrow = 12, ncol = 8), matrix(c(0, 0, 0, 0), nrow = 12, ncol = 8), matrix(c(0, 0, 1, 1), nrow = 12, ncol = 8)), dim = c(12, 8, 3) ) fake_rimg <- as.rimg(fakeimg) is.rimg(fake_rimg) #>  TRUE str(fake_rimg) #> 'rimg' num [1:12, 1:8, 1:3] 1 1 0 0 1 1 0 0 1 1 ... #> - attr(*, "state")= chr "raw" #> - attr(*, "imgname")= chr "img" #> - attr(*, "px_scale")= logi NA #> - attr(*, "raw_scale")= logi NA #> - attr(*, "k")= logi NA #> - attr(*, "outline")= logi NA #> - attr(*, "colnames")= logi NA #> - attr(*, "tag_loc")= logi NA
Thanks to the underlying class system of
pavo, the generic functions
summary recognise and respond to image-data in an appropriate manner, as is the case with spectral data. A quick call to
plot, for example, prints our images. Note that while we call an image individually below (for convenient printing in the vignette), we could simply feed the entire list of images to
plot and it would automatically step through each image in the list upon user input.
summary function offers some summary information about images or lists of images as we might expect. If we specify
plot = TRUE however, the function instead plots the image. When such images have been colour-classified this also plots the images’ colour palette alongside (see below). This is an extremely useful diagnostic tool when classifying images, as it allows us to see how well the classification algorithm has performed in clustering colour pattern elements into discrete colour categories. As with all plots, the function takes all standard plot arguments (see
?par), such as custom colours, which again can be diagnostically useful.
Images often require some post-capture processing, depending on the intended use, and
pavo offers some useful options via
procimg(). The functionality of
procimg() is currently limited to only that which is closely relevant to the analyses currently implemented, as we assume any major processing and quality-checking (e.g. image rotation and colour correction) is undertaken by the user beforehand. However, should users with to keep their image processing within the R ecosystem, there now exist several excellent image-processing packages such as
pavo can seamlessly work with. Indeed, the functions
rimg2magick() convert images from
pavo preferred structure to that used by
imager (with the reverse simply achieved via
as.rimg()), which simplifies the transitions between these packages.
The functionality of
procimg() is currently centred on rotating and resizing images, setting a real-world image scale, and/or distinguishing the focal object from its background. Rotating an image simply requires inputting an angle, in degrees, by which the image is rotated about its centre, while resizing requires a scaling factor by which the size of image(s) are to be decreased or increased. Setting scales and distinguishing backgrounds, in contrast, are both interactive procedures, meaning that we are presented with the focal image and must select points on it using the mouse (either a line, when setting a scale, or a polygon when identifying the focus). If setting a scale, we specify a scale in our preferred units via
scaledist, and then click the endpoints of the scale within the image (which could be a ‘formal’ scale, or simply the length of a wing or another known reference object). This information, as usual, is attached to the image as an attribute.
# Interactively specify the real-world scale of the image. Here 100 mm. butterflies <- procimg(butterflies, scaledist = 100)
Distinguishing a focal object from its background is important in cases where we are analysing a broader visual scene, particularly when the background is highly heterogeneous, such as an animal in nature, and cannot be classified into a homogeneous block and identified more simply (see below). To define the object and background we specify
outline = TRUE, which will result in being asked to click around the outline of the focal object in each image, which is then converted to a polygon — the xy coordinates of which are saved as an attribute. It is often useful to slightly smooth the resulting polygon, which can be done with Chaikin’s corner-cutting algorithm by specifying a number of
refinements. Chaikin’s algorithm works by iteratively replacing every point by two new points 1/4 of the way to the next. The argument
iterations specifies the number of corner cutting iterations to apply, and the default value of 1 offers a subtle smoothing effect.
# Interactively specify a smoothed polygon around the focal objects butterflies <- procimg(butterflies, outline = TRUE, iterations = 1)
All of these options can be simultaneously specified in a single call to
procimg(), and the
pavo 2.0 publication offers a useful example of such processing, since the more involved features are difficult to illustrate here due to their interactive nature.
Cuthill, Innes C., Andrew T. D. Bennett, J. C. Partridge, and E. J. Maier. 1999. “Plumage Reflectance and the Objective Assessment of Avian Sexual Dichromatism.” The American Naturalist 153 (2): 183–200. https://doi.org/10.1086/303160.
Eliason, Chad M., and Matthew D. Shawkey. 2012. “A Photonic Heterostructure Produces Diverse Iridescent Colours in Duck Wing Patches.” Journal of the Royal Society Interface 9 (74): 2279–89. https://doi.org/10.1098/rsif.2012.0118.
Gruson, Hugo, Thomas White, and Rafael Maia. 2019. “Lightr: Import Spectral Data and Metadata in R.” Journal of Open Source Software 43: 1857. https://doi.org/10.21105/joss.01857.