# About samples and echo integration

It is important to understand the influence of samples on the calculation and reporting of the echo integral (ABC or NASC) and the subsequent estimation of biomass. This page covers:

• Sampling and samples.
• The use of echo integration to estimate the density and biomass of targets, such as fish, in a body of water.
• Sampling error and appropriate ways to deal with it.

We draw your attention in particular to the appropriate treatment of bad data when using echo integration to estimate the density and biomass of fish and/or other targets of interest.

## Sampling and samples

Sampling is needed when it’s not possible to measure everything. Sampling involves taking measurements from a subset of a statistical population and using them to estimate the characteristics of the whole population. A common example of sampling in hydroacoustic studies involves making echosounder measurements of fish density from a moving vessel along transect lines (a survey).

Samples are the measurements made while sampling. Echosounders and sonars measure time (since ping transmission) and voltage from the echo return, which they digitize into discrete samples that are representative of the echo intensity within a given volume of water.
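As a minimal sketch of how the time measurement maps to a position in the water, the two-way travel time of a sample can be converted to range, and the sampling interval to a sample thickness. The function names and the nominal sound speed of 1500 m/s are assumptions made for illustration; they are not Echoview settings.

```python
def sample_range_m(time_s, sound_speed_m_s=1500.0):
    """Range (m) of a sample received time_s seconds after ping transmission."""
    # The pulse travels out and back, so the one-way range is half the path.
    return sound_speed_m_s * time_s / 2.0


def sample_thickness_m(sample_interval_s, sound_speed_m_s=1500.0):
    """Range extent (m) covered by one digitized sample."""
    return sound_speed_m_s * sample_interval_s / 2.0


# A sample received 0.1 s after transmission lies about 75 m from the transducer.
range_m = sample_range_m(0.1)
```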

## Echo integration, density and biomass

Echo integration is a widely adopted and well-established technique for estimating aquatic target density, and hence biomass, from hydroacoustic echo-intensity samples (typically collected with a calibrated echosounder).

### Key quantities

The key quantities in echo integration (after MacLennan et al 2005) are:

### Calculating density and biomass

The volumetric density (ρv) and/or areal density (ρa) of a given group of targets in a specified volume of water can be calculated from the total echo intensity as follows (after Acoustics Unpacked: Density, MacLennan et al 2005):

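1. $\rho_v = s_v / \sigma_{bs}$ individuals/m³. The equation may be re-expressed as $10 \log_{10} \rho_v = S_v - TS$.
2. $\rho_a = s_a / \sigma_{bs}$ individuals/m². Alternatively, $\rho_A = s_A / \sigma_{sp} = s_A / (4\pi\,\sigma_{bs})$ individuals/nmi².

See also Density number and Density weight.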

Option 1 provides the volumetric density of targets in 1 m³ of water, where sv is the volume backscattering coefficient and σbs is the backscattering cross-section.

Option 2 provides an areal density for a volume of water with a specified surface area (typically 1 m² or 1 nmi², as described above) and a specified height (e.g. the whole water column, the top 20 m of the water column, the mean height of the fish school etc). Where direct measurements of σbs, σsp and/or TS are unavailable, modeled estimates are used (<σbs>, <σsp>, <TS>). These estimates are determined by variously accounting for the properties of the targets that influence their TS, namely their material properties, size, shape and orientation. The volumetric and areal densities are interchangeable if you know the column area and height.
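The two options can be sketched in a few lines of Python. This is an illustration only, assuming a mean Sv and a representative TS (both in dB) are already available; the function names are hypothetical and the numbers are invented for the example.

```python
import math

NMI_IN_M = 1852.0  # metres per nautical mile


def db_to_linear(value_db):
    """Convert a dB quantity (e.g. Sv or TS) to its linear equivalent."""
    return 10.0 ** (value_db / 10.0)


def volumetric_density(sv_mean_db, ts_db):
    """Option 1: individuals per m^3, from 10*log10(rho_v) = Sv - TS."""
    return db_to_linear(sv_mean_db - ts_db)


def areal_density_per_m2(rho_v, height_m):
    """Option 2: individuals per m^2 for a layer of the given height."""
    return rho_v * height_m


def areal_density_per_nmi2(nasc, ts_db):
    """Option 2 (alternative): individuals per nmi^2 from NASC and sigma_bs."""
    sigma_bs = db_to_linear(ts_db)
    return nasc / (4.0 * math.pi * sigma_bs)


rho_v = volumetric_density(-60.0, -40.0)      # about 0.01 individuals/m^3
rho_a = areal_density_per_m2(rho_v, 50.0)     # about 0.5 individuals/m^2

# The same layer expressed as NASC gives a consistent density per nmi^2.
nasc = 4.0 * math.pi * NMI_IN_M ** 2 * db_to_linear(-60.0) * 50.0
rho_A = areal_density_per_nmi2(nasc, -40.0)
```

Dividing `rho_A` by 1852² recovers `rho_a`, which is the sense in which the volumetric and areal forms are interchangeable once the height and area are known.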

The biomass (B in kg or t) of a given population of targets in a given body of water can then be estimated if you know:

1. The mean target density of the population.
2. The volume or area of the water body.
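A minimal sketch of the biomass step follows. It assumes the density is a number density (individuals/m²), so a mean weight per individual is also needed; that extra parameter, the function name and the values are assumptions made for illustration.

```python
def biomass_kg(mean_areal_density_per_m2, area_m2, mean_weight_kg):
    """Biomass (kg) = mean number density x area x mean weight per individual."""
    return mean_areal_density_per_m2 * area_m2 * mean_weight_kg


# 0.5 individuals/m^2 over 2 km^2, with an assumed mean weight of 0.1 kg each.
b_kg = biomass_kg(0.5, 2.0e6, 0.1)
b_t = b_kg / 1000.0  # the same estimate in tonnes
```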

## Sources of error in hydroacoustic surveys

There are two main sources of error (uncertainty) in hydroacoustic surveys (after Demer 2004):

1. Measurement error
2. Sampling error

Measurement error is generally considered insignificant compared to sampling error, (typically) because of the much larger number of measurements that are averaged (central limit theorem).

Sampling error arises because, by definition, samples are not available for the full extent of the population you wish to describe. For a hydroacoustic survey, this occurs due to:

1. No measurements between sampling transects.
2. Gaps in the measurements along sampling transects, brought about by:
    1. Gaps between pings (related to ping rate and beam angle).
    2. Volumes beyond the range of the instrument.

Note: For an aquatic survey there is an implicit but obvious reference to the water column as the boundary of the population being described. It is worth noting that hydroacoustic surveys also yield samples beyond the water column, typically samples below the bottom, and these can simply be excluded from analysis.
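The dependence of gaps between pings on ping rate and beam angle can be illustrated with simple geometry. The sketch below assumes a conical, vertically oriented beam and a constant vessel speed; the function names and values are hypothetical.

```python
import math


def beam_width_m(range_m, beam_angle_deg):
    """Along-track width of a conical beam at the given range."""
    return 2.0 * range_m * math.tan(math.radians(beam_angle_deg) / 2.0)


def ping_spacing_m(vessel_speed_m_s, ping_rate_hz):
    """Distance travelled by the vessel between successive pings."""
    return vessel_speed_m_s / ping_rate_hz


def along_track_gap_m(range_m, beam_angle_deg, vessel_speed_m_s, ping_rate_hz):
    """Unsampled distance between adjacent beams; zero where beams overlap."""
    spacing = ping_spacing_m(vessel_speed_m_s, ping_rate_hz)
    return max(0.0, spacing - beam_width_m(range_m, beam_angle_deg))


# 7 degree beam, 5 m/s vessel speed, 1 ping per second.
gap_shallow = along_track_gap_m(10.0, 7.0, 5.0, 1.0)   # beams do not touch
gap_deep = along_track_gap_m(100.0, 7.0, 5.0, 1.0)     # beams overlap
```

Under these assumptions the gaps are largest close to the transducer and close up with range as the beam widens.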

## General approaches for dealing with sampling error

Between-transect sampling error has been covered extensively in the literature. For hydroacoustic surveys, density values for the volumes between transects can be estimated from the sample data with a variety of statistical approaches, including block averaging (e.g. Jolly and Hampton 1990), geostatistics (e.g. Simmonds and Fryer 1996) and Bayesian probability (e.g. Brierley et al 2003).

Along-transect sampling error has been discussed much less in the hydroacoustic literature. In general, you can deal with measurement gaps in one of two ways:

1. Group the good samples that you do have into time-based and/or space-based bins and calculate the mean value of those samples for each bin.
2. Estimate the values within the measurement gaps based on a model.

Option 1 assumes, in effect, that the measurement gaps within the bin being described have a value equal to the mean of the good samples in that bin. This is a robust assumption to make in the context of both the central limit theorem and autocorrelation, especially when you have no further information available about the missing values (but see Further considerations below).

Option 2 assumes that you have further information available about the values within the measurement gaps. A simple example might be a depth range for a target species, below which you know they do not live; volumes below this depth can therefore be confidently assigned a value of zero (empty water) in terms of the target species.
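The difference between the two options can be sketched for a single bin. In this illustration a gap is marked with `None`, averaging is done in the linear domain, and the modeled value for Option 2 is zero (empty water); the function names and sample values are assumptions.

```python
import math


def mean_sv_good_samples(sv_db_samples):
    """Option 1: mean Sv (dB) of the good samples only; gaps are ignored."""
    good = [10.0 ** (s / 10.0) for s in sv_db_samples if s is not None]
    if not good:
        return None  # nothing to average in this bin
    return 10.0 * math.log10(sum(good) / len(good))


def mean_sv_with_model(sv_db_samples, modeled_linear_sv=0.0):
    """Option 2: fill each gap with a modeled linear sv before averaging."""
    filled = [modeled_linear_sv if s is None else 10.0 ** (s / 10.0)
              for s in sv_db_samples]
    mean = sum(filled) / len(filled)
    return 10.0 * math.log10(mean) if mean > 0.0 else float("-inf")


bin_samples = [-60.0, None, -60.0, None]
option_1 = mean_sv_good_samples(bin_samples)   # gaps take the mean of good samples
option_2 = mean_sv_with_model(bin_samples)     # gaps treated as empty water
```

With half of the bin missing, treating the gaps as empty water lowers the mean by about 3 dB relative to Option 1, which shows how strongly the chosen assumption can affect the result.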

## Dealing with sampling error in Echoview

### Measurement gaps between transects

For the measurement gaps between transects, the typical approach is to apply statistical procedures outside of Echoview to the along-transect density measurements. Examples of such procedures and related citations are provided above.

### Measurement gaps along transects

For the measurement gaps along transects:

• Gaps between pings are generally assumed (often implicitly) to equate to the mean of the good samples within a given time/space bin. Explicit treatment of the volumes between pings does not appear to be a common approach (we’re not aware of any studies in this regard).
• Volumes beyond the range of the instrument are typically ignored. That is, nothing is estimated or reported regarding the content of those volumes.
• Corrupt samples (bad data) can be identified and treated appropriately by:
    • Implementing a custom algorithm via a dataflow of operators to replace those samples with modeled values.

#### Note

Measurements made beyond the water column (below bottom for vertical soundings; below bottom and/or above the surface for horizontal soundings) are not sources of sampling error. They simply represent measurements to be excluded from analysis. The same is true for measurements made outside the scope of the survey design (e.g. during off-transect periods, during sampling stations, outside the depth layer of interest etc.).

### Samples and regions

Echoview integration and single target analysis are affected by sample and region analysis settings on the Analysis page of the Variable Properties dialog box. Visible bad region types and no-data samples may be handled in a number of ways.

Use applied Bad data (empty water) regions to deal with data you deem to have no acoustic return.

Use applied Bad data (no data) regions to exclude samples from the analysis. This effectively reduces the Thickness mean and therefore the volume for which NASC is reported.

Use applied Bad data (no data) regions together with the Include the volume of no data samples option to effectively convert no-data samples to the mean of the analysis domain. This gives you greater flexibility in calculating your NASC-based results.

For detailed information refer to About analysis domains: Analysis settings and Integration cues and outcomes.
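The effect of no-data samples on NASC can be sketched for a simplified case. The example assumes a single ping with samples of equal thickness, no-data samples marked with `None`, and NASC computed as 4π × 1852² × mean sv × thickness. The function and its `include_no_data_volume` flag are hypothetical names that mirror the behavior described above; they are not Echoview's implementation.

```python
import math

NMI_IN_M = 1852.0  # metres per nautical mile


def nasc(sv_db_samples, sample_thickness_m, include_no_data_volume=False):
    """NASC (m^2/nmi^2) for one ping; None marks a no-data sample."""
    good = [10.0 ** (s / 10.0) for s in sv_db_samples if s is not None]
    if not good:
        return 0.0
    sv_mean = sum(good) / len(good)  # no-data samples never enter the mean
    # Without the option, no-data samples contribute no thickness.
    n = len(sv_db_samples) if include_no_data_volume else len(good)
    thickness_m = n * sample_thickness_m
    return 4.0 * math.pi * NMI_IN_M ** 2 * sv_mean * thickness_m


ping = [-60.0, -60.0, None, None]
nasc_reduced = nasc(ping, 1.0)                              # reduced volume
nasc_full = nasc(ping, 1.0, include_no_data_volume=True)    # full volume
```

In this sketch, including the volume of the no-data samples doubles the reported NASC, because the mean of the good samples is applied to the whole 4 m layer rather than to the 2 m that was actually measured.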

Other assumptions may be implemented with virtual variables that can partition data and replace sample values with an estimate. For more detailed information refer to Further considerations.

### Earlier Echoview versions

##### Echoview 4.90 and earlier

When performing an integration analysis (onscreen or for export), excluded samples did not contribute to the Sv mean and no-data samples had no sample interval (sample thickness).

No-data samples effectively reduced the value of Thickness mean. Downstream analysis variables were affected such that the NASC (onscreen or in export data) was reported for the reduced volume. The user had to decide whether the reported NASC was representative of the analysis domain or whether another assumption needed to be made.

Analysis page settings to exclude samples. One or a combination of settings can be used:

• Exclude above line
• Exclude below line
• Exclude bad data regions
• Exclude pings where Line Status is Bad on the Exclude below line, and set Exclude below line to a user specified line

Note: When these settings are not selected, no samples are excluded. This means that the Sv mean and Thickness mean and NASC are calculated using all the samples in the analysis domain.

##### Echoview 4.90

Echoview 4.90 introduced a setting to deal with the special case where an entire ping is excluded from analysis. This setting is "Whole excluded or no-data pings removed from ping count" (it was known as "Whole excluded or no-data pings do not reduce thickness mean" in Echoview 4.90). This setting helps you to set up the conditions where NASC can represent the water volume you intended.

When performing an integration analysis using this setting, excluded samples:

• do not contribute to the Sv mean
• reduce the Thickness mean of the analysis domain except for rectangular domains where excluded samples make up whole pings and Whole excluded or no-data pings removed from ping count is selected; in this case excluded samples do not reduce the Thickness mean of the analysis domain.

The setting Whole excluded or no-data pings removed from ping count on the Analysis page can therefore help you to represent the water volume as intended in many cases (but refer to the Note below). It is recommended that you use this and other Exclusion settings on the Analysis page with care and ensure that you are aware of the effect a setting has on Thickness_mean.

Using the Whole excluded or no-data pings removed from ping count setting is equivalent to removing the pings from the data set. It is as if the echosounder had not pinged at that location. The virtual variable “Reduce pings” does exactly this: it removes empty pings from the data set and can be used to visualize the effect of this Analysis setting (even if you do not own an Advanced Operators module).

Note:

Be aware of the case where a whole excluded or no-data ping includes samples that are known to be out-of-bounds (e.g. samples below an exclude-below line).

In this case the Thickness mean for the domain may be over- or underestimated, depending on whether the exclude-below-line depth in the removed pings is greater than or less than the mean exclude-below-line depth of the domain. If it can be assumed that whole excluded pings are randomly placed with respect to the depth of exclude lines in domains (probably a reasonable assumption in most surveys), then the error from this source will average to zero over multiple domains.
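The effect of removing whole excluded pings from the ping count can be sketched with a simplified model. It assumes a rectangular domain, samples of equal thickness, and a Thickness mean defined as the total thickness of included samples divided by the ping count; the function and its `remove_empty_pings` flag are hypothetical and are not Echoview's implementation.

```python
def thickness_mean(pings, sample_thickness_m, remove_empty_pings=False):
    """Mean thickness (m) of a domain.

    pings is a list of pings; each ping is a list of booleans,
    True where the sample is included in the analysis.
    """
    included_per_ping = [sum(ping) for ping in pings]
    total_thickness = sum(included_per_ping) * sample_thickness_m
    if remove_empty_pings:
        # Whole excluded pings are dropped from the ping count.
        ping_count = sum(1 for n in included_per_ping if n > 0)
    else:
        ping_count = len(pings)
    return total_thickness / ping_count if ping_count else 0.0


# Four pings of ten 0.5 m samples; the second ping is wholly excluded.
domain = [[True] * 10, [False] * 10, [True] * 10, [True] * 10]
reduced = thickness_mean(domain, 0.5)                            # 3.75 m
intended = thickness_mean(domain, 0.5, remove_empty_pings=True)  # 5.0 m
```

With the setting applied, the Thickness mean matches the 5 m layer that was intended, as if the excluded ping had never been recorded.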

Analysis page settings to exclude samples and not reduce the Thickness mean:

• Whole excluded or no-data pings removed from ping count

Together with any or none of:

• Exclude above line
• Exclude below line
• Exclude bad data region
• Exclude pings where Line Status is Bad on the Exclude below line, and set Exclude below line to a user specified line

##### Emulation of results under Echoview 5.3 "Whole excluded or no-data pings removed from ping count"

The Echoview 5.3 setting "Whole excluded or no-data pings removed from ping count" provided a partial solution for affected area density in the case of excluded samples that spanned a whole ping. To reproduce the effects of the superseded setting, use the Reduce pings operator. This workaround is not well suited to ping-based interval grids.

Note: Echoview 5.4 offers an affected area density solution that considers samples in partial and/or full pings.

## Further considerations

### Strategies

In Standard operating procedures for fisheries acoustic surveys in the Great Lakes: 9.1.5 Noise Removal, Parker-Stetter et al (2009) discuss the assumptions about fish density in bad-data regions and the impacts on Sv and sa (area backscattering coefficient; in Echoview this is ABC), and suggest how Echoview users may handle this with Echoview analysis export data.

Operators that help you partition data and replace sample values with an estimate include:

| Operator | Action |
|---|---|
| Bitmap operators | Bitmap operators include: Mask (applies a bitmap mask to data); Boolean Or, Not, And (applies a Boolean operation to one or more bitmaps); Data range bitmap (applies a bitmap using a data range); Region bitmap (creates a bitmap of a region); Select (uses a bitmap to select between two values). |
| Data generator | Creates data pings. It can create pings with constant Sv, TS, unspecified-dB and linear values. It can also create Sv and TS pings with time-varied gain (TVG), based on the Sv or TS value at 1 meter and a specified absorption coefficient. |
| Formula | Provides tools to create and evaluate an arbitrary mathematical equation on a per-sample basis on the echogram data. |
| Merge pings | Merges the pings in two variables to create a single variable containing the pings from both the input variables. |
| Ping subset | Divides up echogram data according to subsets of pings. |
| Processed data | Applies the line and bad data exclusion settings specified on the Analysis page of the Variable Properties dialog box for its input variable, and changes excluded sample values to 'no data' values. |
| Reduce pings | Removes pings that meet certain criteria. |
| Region statistic | Fills regions with a statistical value from another echogram. |
| Resample operators | Resamples echogram data by ping, distance or time interval. |
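The idea of partitioning data with a bitmap and replacing sample values with an estimate can be sketched in plain Python. This is an analogy only: the functions below are hypothetical stand-ins that loosely mimic a data range bitmap, a constant data generator and a select step on nested lists, and they do not call or reproduce any Echoview operator.

```python
def data_range_bitmap(echogram, minimum, maximum):
    """True where a sample lies within [minimum, maximum]."""
    return [[minimum <= s <= maximum for s in ping] for ping in echogram]


def constant_pings(echogram, value):
    """Pings of the same shape as echogram, filled with a constant value."""
    return [[value for _ in ping] for ping in echogram]


def select(bitmap, when_true, when_false):
    """Per-sample choice between two echograms, driven by a bitmap."""
    return [
        [t if b else f for b, t, f in zip(b_ping, t_ping, f_ping)]
        for b_ping, t_ping, f_ping in zip(bitmap, when_true, when_false)
    ]


# Two pings of Sv samples (dB); values outside -80..-30 dB are treated as bad.
sv = [[-70.0, -20.0], [-65.0, -90.0]]
plausible = data_range_bitmap(sv, -80.0, -30.0)
estimate = constant_pings(sv, -75.0)       # assumed modeled value
cleaned = select(plausible, sv, estimate)  # keep good samples, replace the rest
```

The choice of thresholds and of the replacement value is the modeling assumption; the dataflow itself only applies it sample by sample.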

### Future directions

The Echoview team is considering integration analysis enhancements that give you greater integration flexibility. These include (but are not limited to) the following ideas:

• Features to handle out of bounds samples.
• The ability to replace bad data with modeled data based on specified assumptions.

Footnotes:

1. Thickness_mean or Height_mean, where Height_mean is the projection of Thickness_mean onto the vertical axis, taking transducer geometry into account.