After collecting a set of parameters (and Bayesian LAMARC will soon be able to output 'summary files', which would contain this information), the information is divvied into groups on a per-parameter and per-replicate basis. The data for each parameter is then smoothed using a Biweight kernel of the form:
The width of this kernel is set as:
where n is the number of points in the data set, and σ is the smaller of the inter-quartile distance divided by 1.34, or the standard deviation of the data set. (If this value falls to zero or near-zero, we substitute an arbitrary minimum to allow the program to continue).
After the data for one parameter from one replicate and one region are smoothed, multi-replicate curves are created by averaging the individual curves, while multi-region curves are created by multiplying and re-normalizing the single-region curves. (Multiple replicates are different views of the same data, and are therefore averaged. Multiple regions are independent views using independent data, and are therefore multiplied.)
The biweight kernel was chosen primarily because it is bounded and quick to calculate. A bounded kernel (unlike a Gaussian kernel) ensures that the the resulting probability density curve has defined boundaries, which better matches the priors for the parameters we estimate. (An even better kernel would be additionally constrained to not spread out over the bounds of the prior, but this works for a basic attempt at analysis.) Silverman (1986) has demonstrated that the choice of kernels does not make an appreciable difference in accuracy otherwise, so the biweight kernel was chosen over other bounded kernels simply because it was simple to calculate.
The formula for the kernel width was taken from Silverman (1986) as modified for use for the biweight kernel. The formula for figuring out the optimal kernel weight (hopt) is:
where k(t) is the kernel function, f(x) is the function you wish to estimate, and n is the number of data points. If a Gaussian is used as an estimate of f(x), and the biweight kernel is used for k(t), this reduces to:
or
Silverman argues that you can account for the fact that your function is unknown (instead of definitely Gaussian, as we assumed in the initial calculation), by reducing the coefficient slightly (which is why we use 2.5 instead of 2.78), and use the lesser of the inter-quartile distance divided by 1.34 and the standard deviation of the data for σ. Using both of these substitutions, we arrive at the equation listed at the beginning: