Higher-Order Histosplines: Superior Density Estimation for Binned Data

Michael C. Minnotte
Utah State University, Logan, USA


ABSTRACT

A method is proposed for achieving fast asymptotic mean integrated squared error convergence rates, O(n^(-8/9)) and faster, in density estimation from data that have been condensed to standard histogram bin counts and edges. Such an approach is useful both when data are collected in binned form and when huge data sets make the storage and computational savings over unbinned methods worthwhile. The method weights B-splines so as to restore the correct mass proportions property, under which the probability mass that the estimate assigns to each histogram bin exactly equals the fraction of the data found in that bin. Computational and visual aspects of the new estimator are examined, and the estimator is compared with kernel estimators.
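
As a rough illustration of the mass proportions property only, and not the estimator proposed here, the sketch below weights linear B-splines (hat functions) centred at the midpoints of equal-width bins so that each bin's integral matches its data fraction. The choice of linear splines, the equal-width bins, the example counts, and the function names are all assumptions made for this illustration; boundary leakage and possible negative weights are ignored.

```python
import numpy as np

def hat(x, center, h):
    """Linear B-spline (hat) of half-width h, scaled to integrate to 1."""
    return np.maximum(0.0, 1.0 - np.abs(x - center) / h) / h

def mass_preserving_weights(counts):
    """Solve A w = p so that every bin integral equals its data fraction.

    Assumes equal-width bins with one hat centred on each bin midpoint.
    A hat then places 3/4 of its mass in its own bin and 1/8 in each
    neighbouring bin, which gives the tridiagonal matrix A below.
    """
    counts = np.asarray(counts, dtype=float)
    p = counts / counts.sum()
    k = len(p)
    A = 0.75 * np.eye(k) + 0.125 * (np.eye(k, k=1) + np.eye(k, k=-1))
    return np.linalg.solve(A, p), p

def estimate(x, edges, weights):
    """Evaluate the weighted sum of hats at the points x."""
    edges = np.asarray(edges, dtype=float)
    h = edges[1] - edges[0]
    mids = 0.5 * (edges[:-1] + edges[1:])
    return sum(w * hat(x, m, h) for w, m in zip(weights, mids))

def bin_masses(edges, weights, n_grid=2001):
    """Integrate the estimate over each bin with the trapezoid rule.

    An odd n_grid puts a grid point on each bin midpoint, where the hats
    have their kink, so the rule is exact up to rounding error.
    """
    masses = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        x = np.linspace(lo, hi, n_grid)
        y = estimate(x, edges, weights)
        masses.append(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)
    return np.array(masses)

# Hypothetical bin edges and counts, padded with an empty bin at each end.
edges = np.linspace(0.0, 6.0, 7)
counts = np.array([0, 3, 10, 25, 9, 0])
weights, p = mass_preserving_weights(counts)

naive = bin_masses(edges, p)           # weights equal to the raw fractions
corrected = bin_masses(edges, weights) # weights from the linear system
```

Here `naive` differs from `p`, because each hat spreads a quarter of its mass into neighbouring bins, while `corrected` reproduces `p` bin by bin. Some solved weights can be negative, and the end hats place mass outside the binned range; a practical estimator would need to handle both.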




Michael C. Minnotte
Department of Mathematics and Statistics, Utah State University,
Logan, UT 84322-3900
minnotte@math.usu.edu