# Sampling

A representative sample is needed to ensure the inspection results reflect the quality and condition of the lot being inspected. A representative sample is a sample that is obtained by official personnel, maintained under the control of official personnel, and is obtained following procedures established in the directives and handbooks issued by the Federal Grain Inspection Service.

Sampling procedures are defined for stationary lots, which are sampled with a probe, and for lots sampled from the grain stream during loading or unloading by diverter-type, mechanical, automatic Ellis cup, and pelican samplers.

There are sampling procedures for both bulk and packaged lots.

## What is a Representative Sample?

In grain inspection, the term "representative" is used to describe the sample taken with the GIPSA sampling procedures. The GIPSA instructions describe systematic procedures for taking samples from static lots and from lots during conveyance. The partitioned probe is used to take grain samples from static lots. Lots are probed multiple times using prescribed patterns. The diverter sampler is the primary device for sampling grain lots while they are being conveyed. The diverter takes a periodic cut from a grain stream. The composite of the probes or the diverter cuts from a lot becomes the representative sample for the lot.

In statistical terms, the desired sample from a lot is a random sample. A simple random sample is one selected in a process where every possible sample has an equal chance of being selected. Probability theory can describe the variability of random samples. In lots, such as packaged products, where all items in the lot can be uniquely identified, a random sample is possible using a random number generator. However, the unique identification of the items in a lot is not always practical. A bulk grain lot is an example where the unique identification of each kernel in the lot is not practical.
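As a minimal sketch of the packaged-lot case, the snippet below draws a simple random sample with Python's standard random number generator. The lot size of 2000 numbered items, the sample size of 10, and the function name `simple_random_sample` are assumptions chosen for illustration; they are not part of the official procedures.

```python
import random

def simple_random_sample(lot_size, sample_size, seed=None):
    """Pick item numbers (1-based) so that every possible sample is equally likely."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so no item is picked twice.
    return sorted(rng.sample(range(1, lot_size + 1), sample_size))

# Hypothetical lot: 2000 uniquely numbered packages, 10 to be inspected.
picked = simple_random_sample(lot_size=2000, sample_size=10, seed=1)
print(picked)
```

This only works because each package can be numbered; the same approach has no practical equivalent for individual kernels in a bulk lot.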

Where random samples are not practical, a sampling procedure known as systematic sampling is often used. To illustrate systematic sampling, suppose that a sample of 10 items is to be taken from a lot of 2000 items. This is a rate of selecting one item for every 200 items in the lot. A random number between 1 and 200 is obtained from a random number generator, say 129. The systematic sample is selected by counting off to item 129 and selecting this item and then selecting every 200th item thereafter. The sample is taken by going through the lot systematically and selecting items on equal intervals. The diverter sample is very similar to the systematic sample. The diverter repeatedly cuts the grain stream throughout the time that a lot is moved. The cuts are separated by equal time intervals.
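The counting-off procedure in this example can be sketched in a few lines. The function name `systematic_sample` and its parameters are hypothetical, and the sketch assumes the lot size is an exact multiple of the sample size, as it is for 2000 items and a sample of 10.

```python
import random

def systematic_sample(lot_size, sample_size, start=None, seed=None):
    """Return the 1-based item numbers selected by a systematic sample."""
    interval = lot_size // sample_size  # 2000 // 10 = 200
    if start is None:
        # Random starting point between 1 and the interval, inclusive.
        start = random.Random(seed).randint(1, interval)
    # Select the starting item, then every `interval`-th item after it.
    return [start + k * interval for k in range(sample_size)]

print(systematic_sample(2000, 10, start=129))
# [129, 329, 529, 729, 929, 1129, 1329, 1529, 1729, 1929]
```

Only the starting point is random; once it is chosen, the rest of the sample is fixed, which is the same situation as a diverter cutting the stream at equal time intervals.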

Systematic sampling is also known as increment sampling.

Systematic samples are usually assumed to be equivalent to random samples. If kernels are randomly distributed in the lot, the systematic sample is a random sample. When kernels are randomly distributed throughout the lot, almost any method of taking a sample will produce a random sample. Concern arises when the lot may not be thoroughly mixed. Increment sampling is one intuitive means of addressing the concerns of non-uniformity through the lot. Taking small portions throughout the lot should improve the chances of sampling from areas of high and low concentration of a characteristic.

In statistical terms, when segments of a lot have different concentrations of a characteristic, this is known as stratification. Stratification can be viewed as layers in a barge or a silo. Another visualization of stratification is the contents of railcars coming down a conveyor belt in sequence.

Increment and random sampling produce samples that give unbiased estimates of the lot content. A sampling procedure is unbiased if the average of all possible sample estimates is the same as the lot content. The concern with increment sampling is typically the variability of the estimates. In particular, what happens to the variability of increment samples when the kernels are not randomly distributed throughout the lot?
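The definition of unbiasedness can be checked directly on a small, made-up lot by listing every possible systematic sample. In the sketch below, the lot layout (2000 items, none with the characteristic in the first half and 2% in the second half) and the name `systematic_estimates` are assumptions for illustration only.

```python
from fractions import Fraction

# Hypothetical stratified lot of 2000 items; 1 marks an item with the characteristic.
# First half: none. Second half: every 50th item, i.e. 2% of that half.
lot = [0] * 1000 + [1 if i % 50 == 0 else 0 for i in range(1000)]

def systematic_estimates(lot, interval):
    """Return the sample proportion for each possible starting point."""
    estimates = []
    for start in range(interval):
        sample = lot[start::interval]  # every `interval`-th item from `start`
        estimates.append(Fraction(sum(sample), len(sample)))
    return estimates

estimates = systematic_estimates(lot, interval=200)
lot_content = Fraction(sum(lot), len(lot))
average_estimate = sum(estimates) / len(estimates)

print(lot_content, average_estimate)  # 1/100 1/100
print(min(estimates), max(estimates))
```

The average over all 200 possible samples equals the lot content exactly, yet the individual estimates differ widely in this contrived lot, which is the variability question raised above.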

A simulation example will illustrate some of the concerns with variability. The lot is constructed of 10 equal size strata. Five of the strata have zero content and the other five strata have a 2% average content. The overall lot contains 1% concentration of some characteristic such as damaged kernels, wheat of other classes, or biotech kernels. The simulation will assume that the lot is a succession of strata passing a diverter sampler. Figure 1 gives the lot content as the lot passes some point, such as the diverter sampler.

Figure 1. Strata Concentrations

The simulations will repeatedly sample the lot with a diverter-type sampler, taking incremental cuts throughout the lot. The number of increments will be varied from 4 to 50. Figure 2 gives an example of what 50 increments may look like if each increment were measured. In practice, the individual increments are not measured. All increments are pooled and analyzed collectively.

Figure 2. Example of 50 Incremental Cuts From Stratified Lot

When 50 equally spaced increments are taken from the grain stream, as in Figure 2, every stratum will be cut five times because the 10 strata are equal in size. In general, the number of incremental cuts taken from each stratum depends on the size of the strata and the spacing between cuts. If this lot is sampled with only four incremental cuts, only four of the strata will be sampled. Which four strata get selected will depend on the starting point. Since each stratum represents one tenth of the lot, 10 equally spaced increments will take one cut from each stratum.
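The counts in this paragraph can be reproduced with a short sketch that places equally spaced cuts along a lot of 10 equal strata. The function name `cuts_per_stratum` and the `start` parameter (the position of the first cut, as a fraction of the spacing between cuts) are hypothetical names introduced here.

```python
from collections import Counter
from fractions import Fraction

def cuts_per_stratum(n_increments, start=Fraction(1, 2), n_strata=10):
    """Count how many equally spaced cuts land in each equal-size stratum.

    `start` must satisfy 0 <= start < 1. Fractions keep boundaries exact.
    """
    counts = Counter()
    for k in range(n_increments):
        position = (k + start) / n_increments  # fraction of the lot already passed
        counts[int(position * n_strata)] += 1  # index of the stratum being cut
    return [counts[s] for s in range(n_strata)]

print(cuts_per_stratum(50))  # [5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
print(cuts_per_stratum(10))  # [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
print(cuts_per_stratum(4))   # [0, 1, 0, 1, 0, 0, 1, 0, 1, 0]
```

Changing `start` for the four-cut case moves the cuts to a different set of four strata, which is why the starting point matters when there are fewer cuts than strata.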

For this simulation example, each incremental cut was assumed to take 1600 kernels of corn (approximately 0.5 kg), and the increments were combined into a bulk sample. The measurement simulated was a test for the percentage of biotech kernels in the lot. The biotech measurement was conducted on a one-kilogram sub-sample (3200 kernels) taken from the bulk sample.

Figure 3 gives the standard deviations for sample estimates based on samples of 4 to 50 increments. These are labeled as "actual" on the graph. The variability is significantly higher when fewer than 10 increments are taken from the lot. When fewer than 10 increments are taken, not all strata are represented in the sample. The variability among strata influences the variability among the estimates because which strata are selected depends on the starting point of the increment sample. When 10 or more increments are taken, all strata will be represented in the sample and the variation among strata will have a much smaller influence on the sample estimates. When the number of increments is not a multiple of 10, the strata variability will still have some influence because the strata will not be equally represented.
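The effect of the starting point can be sketched by isolating the between-strata part of the variability. This is not the simulation behind Figure 3: it ignores kernel-to-kernel sampling variation, and the ordering of the five 0% and five 2% strata in `STRATA` is an arbitrary assumption, since the layout in Figure 1 is not reproduced here. The function names are hypothetical.

```python
import statistics

# Assumed ordering of the strata (five at 0%, five at 2%); overall content is 1%.
STRATA = [0.0, 0.02, 0.02, 0.0, 0.0, 0.0, 0.02, 0.0, 0.02, 0.02]

def pooled_content(n_increments, start, strata=STRATA):
    """Expected content of the pooled sample for one starting point (0 <= start < 1)."""
    total = 0.0
    for k in range(n_increments):
        position = (k + start) / n_increments  # fraction of the lot already passed
        total += strata[int(position * len(strata))]
    return total / n_increments

def between_strata_sd(n_increments, n_starts=1000, strata=STRATA):
    """Standard deviation of the pooled content over a grid of starting points."""
    # Half-step offsets keep every cut away from a stratum boundary.
    starts = [(j + 0.5) / n_starts for j in range(n_starts)]
    contents = [pooled_content(n_increments, s, strata) for s in starts]
    return statistics.pstdev(contents)

for n in (4, 5, 10, 15, 20, 50):
    print(n, round(between_strata_sd(n), 5))
```

Under this assumed layout, the between-strata component vanishes at 10, 20, and 50 increments, where every stratum is cut equally often, and is nonzero at 4, 5, and 15. The size of the effect for a given number of increments depends on how the strata are ordered.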

Figure 3. Variability of Sample Estimates

A second series on the graph in Figure 3, labeled "Approx", gives the approximated standard deviation of the sample estimate. As stated earlier, the test sample is assumed to be a random sample from the lot. If the sample were indeed a random sample, the standard deviation from a binomial distribution would be the appropriate standard deviation to use. This is the standard deviation that is assumed to represent the variability of the sample estimate. From the graph, the binomial approximates the actual variability when at least 10 increments are taken from the lot.
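For reference, the binomial standard deviation of a sample proportion is the square root of p(1 - p)/n. The sketch below evaluates it for the figures used in this example, a 1% lot content and a 3200-kernel test sample; the function name `binomial_sd` is introduced here for illustration, and the result is a calculation from those two numbers rather than a value read from Figure 3.

```python
import math

def binomial_sd(p, n):
    """Standard deviation of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)

sd = binomial_sd(p=0.01, n=3200)
print(f"{sd:.5f}")        # 0.00176 as a proportion
print(f"{100 * sd:.3f}")  # 0.176 percentage points
```

Note that this value depends only on the lot content and the test sample size, not on the number of increments, which is why it cannot reflect the extra variability that arises when some strata are missed.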

When fewer than 10 increments are taken, the binomial approximation underestimates the actual variability. This observation can be extrapolated to any lot where more strata exist than increments taken. The assumption of a random sample will underestimate the actual sample variation when many small and diverse strata exist in the lot. In large bulk handling facilities, grain is mixed and blended every time it is moved. Many small, diverse strata are not believed to exist in these facilities. The assumption that the increment sample is a random sample is probably reasonable in most circumstances.