The Use of Orthogonal Functions to Characterize Probability Distributions

 

A seldom-used but computationally efficient technique uses orthogonal functions to parameterize a model of an empirical probability distribution.

 

Simply stated, the probability distribution is represented as a sum of polynomials (or other easily computed functions) that are orthogonal with respect to a weight related to the expected distribution.  The advantage is computational: the empirical distribution of observed events is a sum of delta functions, each centered at an observed value and carrying weight 1/N, so integrating it against the polynomial for a given coefficient reduces to a simple weighted sum over the observations.
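
As a concrete illustration, here is a minimal sketch of the idea in Python (the function name legendre_density and the test data are illustrative, not from the original discussion), using Legendre polynomials, which are orthogonal on [-1, 1] under a uniform weight.  Because the empirical distribution is a sum of delta functions of weight 1/N, each coefficient c_k = (2k+1)/2 * (1/N) * sum_i P_k(x_i) is just a sample mean.

    import numpy as np
    from numpy.polynomial import legendre

    def legendre_density(samples, max_degree=8):
        """Estimate a density on [-1, 1] as a truncated Legendre series.

        Integrating the empirical distribution (delta functions of
        weight 1/N) against P_k collapses to a sample mean; the factor
        (2k+1)/2 comes from the norm of P_k on [-1, 1].
        """
        samples = np.asarray(samples)
        coeffs = np.empty(max_degree + 1)
        for k in range(max_degree + 1):
            basis = np.zeros(k + 1)          # selects the degree-k polynomial P_k
            basis[k] = 1.0
            coeffs[k] = (2 * k + 1) / 2.0 * legendre.legval(samples, basis).mean()
        return coeffs

    # Usage: Beta(2, 2) samples mapped to [-1, 1] have density (3/4)(1 - x^2),
    # which equals 0.5*P_0 - 0.5*P_2, so c_0 ~ 0.5, c_2 ~ -0.5, the rest ~ 0.
    rng = np.random.default_rng(0)
    data = 2.0 * rng.beta(2, 2, size=5000) - 1.0
    c = legendre_density(data, max_degree=6)
    x = np.linspace(-1, 1, 201)
    f_hat = legendre.legval(x, c)            # estimated density on a grid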

 

Why is this used so infrequently?

 

For several reasons.  One obvious reason is that a probability distribution represented this way is not guaranteed to be “physical”: the estimated distribution may take values greater than unity or below zero in places.  Another issue is the appropriate selection of the weighting function.  Ideally, the weight is very close to the actual probability distribution, so that the coefficient of the lowest-order term (whose basis function is usually chosen to be the constant 1) is nearly one and the higher-order coefficients are small.
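
The weight-matching point can be made concrete.  In the sketch below (again illustrative; the helper name gaussian_weighted_coeffs is not from the original text), the weight is the standard normal density and the basis is the probabilists' Hermite polynomials He_k, which are orthogonal under that weight with norm k!.  If the data really are standard normal, c_0 = 1 exactly and every higher coefficient should hover near zero.

    import numpy as np
    from math import factorial
    from numpy.polynomial import hermite_e

    def gaussian_weighted_coeffs(samples, max_degree=6):
        """Coefficients of f(x) ~ phi(x) * sum_k c_k He_k(x), where phi is
        the standard normal density.  Orthogonality under the weight phi
        gives c_k = E[He_k(X)] / k!, again just a sample mean."""
        samples = np.asarray(samples)
        coeffs = np.empty(max_degree + 1)
        for k in range(max_degree + 1):
            basis = np.zeros(k + 1)          # selects He_k
            basis[k] = 1.0
            coeffs[k] = hermite_e.hermeval(samples, basis).mean() / factorial(k)
        return coeffs

    rng = np.random.default_rng(1)
    print(gaussian_weighted_coeffs(rng.standard_normal(20000)))
    # He_0 = 1, so c_0 is exactly 1; c_1..c_6 come out small when the
    # Gaussian weight is a good match for the data.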

 

From one point of view, these objections can be considered advantages.  If the estimated probability function is significantly non-physical, or if much of the distribution’s “energy” sits in higher-order terms, the underlying distribution model is probably flawed and should be reexamined.  On the other hand, if the real distribution is multimodal, the orthogonal-function method may be a very efficient way of detecting it, as the sketch below suggests.
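
Continuing the earlier Legendre sketch (and reusing its legendre_density helper, so this is again illustrative rather than canonical), a bimodal sample pushes “energy” into higher-order coefficients, and the truncated reconstruction may dip below zero between and beyond the modes; both symptoms are cheap diagnostics.

    # Diagnostic sketch, reusing legendre_density and imports from above.
    rng = np.random.default_rng(2)
    bimodal = np.concatenate([rng.normal(-0.5, 0.1, 2500),
                              rng.normal(0.5, 0.1, 2500)])
    c = legendre_density(np.clip(bimodal, -1, 1), max_degree=10)

    # "Energy" per term: the squared L2 norm of c_k * P_k is c_k^2 * 2/(2k+1).
    energy = c**2 * 2.0 / (2.0 * np.arange(len(c)) + 1.0)
    print(energy / energy.sum())   # heavy weight beyond degree 0 flags structure

    x = np.linspace(-1, 1, 201)
    print((legendre.legval(x, c) < 0).any())   # negative dips: "non-physical"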