General Power Data Scatter and Trend Measure and Estimation Theories

by

© Ph. D. & Dr. Sc. Lev Gelimson

Academic Institute for Creating Fundamental Sciences (Munich, Germany)

Mathematical Journal

of the "Collegium" All World Academy of Sciences

Munich (Germany)

11 (2011), 12

In data modeling, processing, estimation, and approximation [1], data scatter is often so great that it is impossible to directly determine the most suitable type or form of analytic approximation expression, e.g. linear, piecewise linear, parabolic, hyperbolic, circumferential, elliptic, sinusoidal, etc. for two-dimensional data, or linear, piecewise linear, paraboloidal, hyperboloidal, spherical, ellipsoidal, etc. for three-dimensional data. Before considering any approximation to such data, one should decide whether the data can be regarded as directed at all. Therefore, it is necessary and very useful to measure precisely, or at least to estimate quantitatively, data scatter and trend (directedness).

Apparently, there are no known measures of data scatter and trend (directedness). At most, only purely qualitative estimations of data scatter (e.g. "data scatter is often great") are known. Generally, data direction and trend (directedness) are not considered at all. The nearest known concepts are the principal central axes of continual areas and volumes, determined via their moments of inertia in mechanics, and there only linear directions are usually considered.

In overmathematics [2, 3] and the fundamental sciences of estimation [4], approximation [5], and data modeling and processing [6], data can be arbitrary, possibly mixed quantisets (e.g. with both discrete and continual parts), data directions can be linear or nonlinear, and universal, invariant, relative (dimensionless) precise measures and (quantitative) estimations of data scatter and trend (directedness) are introduced. Such measures and estimations can be, e.g., quadratic [7]. But the second power can be entirely insufficient, and not only in the least square method [8]. Hence the 4th power is also used [9].

Now consider the general case of any positive power $p$. Namely, take into account the $p$th powers of the distances of the data points from a bisector (an approximation) and the sums ${}^{p}S$ of these powers. Such a sum can also be regarded as a general moment of inertia $J$ of order $p$.

In the simplest case of 2D discrete data, determine ${}^{p}S_{\min}$ and ${}^{p}S_{\max}$ and then

${}^{p}S = \left[{}^{p}S_{\min} / {}^{p}S_{\max}\right]^{1/p}$

as a measure of data scatter with respect to the $p$th power of a distance.

Also introduce a measure of data trend with respect to the $p$th power of a distance

${}^{p}T = 1 - {}^{p}S = 1 - \left[{}^{p}S_{\min} / {}^{p}S_{\max}\right]^{1/p}$.
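The two measures above can be sketched numerically for 2D discrete data. The following is only a minimal illustration under stated assumptions, not the author's procedure: the candidate bisectors are restricted to straight lines through the centroid of the data, and the minimum and maximum sums are found by scanning a finite grid of line angles. The function name `power_scatter_trend` and its parameters are hypothetical. For powers other than 2 the best straight line need not pass through the centroid, so the result should be read as an estimate.

```python
import math

def power_scatter_trend(points, p=2.0, n_angles=3600):
    """Estimate (scatter, trend) for 2D points with respect to the p-th power.

    Assumption (not from the paper): only straight lines through the
    centroid are scanned, on a uniform grid of n_angles directions.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    s_min, s_max = math.inf, 0.0
    for k in range(n_angles):
        theta = math.pi * k / n_angles               # line direction in [0, pi)
        nx, ny = -math.sin(theta), math.cos(theta)   # unit normal to the line
        # Sum of p-th powers of point-to-line distances for this direction.
        s = sum(abs((x - cx) * nx + (y - cy) * ny) ** p for x, y in points)
        s_min = min(s_min, s)
        s_max = max(s_max, s)
    if s_max == 0.0:
        raise ValueError("all points coincide; the ratio is undefined")
    scatter = (s_min / s_max) ** (1.0 / p)
    return scatter, 1.0 - scatter
```

For example, `power_scatter_trend([(0, 0), (1, 1), (2, 2)])` should give a scatter near 0 and a trend near 1, because the points are collinear, while four points placed symmetrically on the coordinate axes give a scatter near 1 for p = 2.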

As ever, the fundamental principle of tolerable simplicity [2, 3] plays a key role.

Naturally, we have $0 \le {}^{p}S \le 1$ and $0 \le {}^{p}T \le 1$.

Nota bene: In principle, such measures of data scatter and trend can also be defined via the ratio $J_{\min} / J_{\max} = {}^{p}S_{\min} / {}^{p}S_{\max}$ itself or via many other suitable functions of this ratio. But the above functions seem to be the most adequate and natural.

To see this, consider a rectangle whose length $L$ is much greater than its width $W$. The moments of inertia of order $p$ of this rectangle about its longitudinal and transversal central axes are

$J_L = \dfrac{L W^{p+1}}{(p+1)\,2^{p}}$

and

$J_T = \dfrac{L^{p+1} W}{(p+1)\,2^{p}}$,

respectively. These axes are its axes of symmetry and hence its principal central axes (at least for $p = 2$). The relation $L \gg W$ gives $J_L \ll J_T$. Hence

$J_{\min} = J_L = \dfrac{L W^{p+1}}{(p+1)\,2^{p}}$,

$J_{\max} = J_T = \dfrac{L^{p+1} W}{(p+1)\,2^{p}}$,

$J_{\min} / J_{\max} = W^{p} / L^{p}$.
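The two moments can be checked by direct integration. The sketch below assumes that the general moment of order $p$ about an axis is the area integral of the $p$th power of the absolute distance to that axis. The coordinates $x$ (along the length) and $y$ (along the width), with the origin at the center of the rectangle, are introduced here only for illustration.

```latex
\begin{aligned}
J_L &= \int_{-L/2}^{L/2}\int_{-W/2}^{W/2} |y|^{p}\,dy\,dx
     = 2L\int_{0}^{W/2} y^{p}\,dy
     = \frac{2L}{p+1}\left(\frac{W}{2}\right)^{p+1}
     = \frac{L\,W^{p+1}}{(p+1)\,2^{p}},\\[4pt]
J_T &= \int_{-W/2}^{W/2}\int_{-L/2}^{L/2} |x|^{p}\,dx\,dy
     = 2W\int_{0}^{L/2} x^{p}\,dx
     = \frac{2W}{p+1}\left(\frac{L}{2}\right)^{p+1}
     = \frac{L^{p+1}\,W}{(p+1)\,2^{p}},\\[4pt]
\frac{J_L}{J_T} &= \frac{L\,W^{p+1}}{L^{p+1}\,W} = \frac{W^{p}}{L^{p}}.
\end{aligned}
```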

Now we see that using the above functions provides very simple and natural formulae for the scatter and trend measures of the rectangle:

${}^{p}S = W/L$,

${}^{p}T = 1 - W/L$.
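The rectangle result can also be checked numerically. The following sketch approximates both moments with a midpoint rule and then forms the scatter measure. The function name `rectangle_moments_and_scatter` and the grid size `n` are assumptions made for this illustration, and the moment of order p is again taken to be the area integral of the pth power of the absolute distance to the axis.

```python
def rectangle_moments_and_scatter(length, width, p, n=400):
    """Approximate (J_L, J_T, scatter) for a centered length x width rectangle.

    Midpoint-rule sketch: the double integrals separate, so only two
    one-dimensional sums are needed.
    """
    hy = width / n
    hx = length / n
    # Integral of |y|^p across the width and of |x|^p along the length.
    iy = sum(abs(-width / 2 + (j + 0.5) * hy) ** p for j in range(n)) * hy
    ix = sum(abs(-length / 2 + (i + 0.5) * hx) ** p for i in range(n)) * hx
    j_long = length * iy    # moment about the longitudinal central axis
    j_trans = width * ix    # moment about the transversal central axis
    j_min, j_max = min(j_long, j_trans), max(j_long, j_trans)
    scatter = (j_min / j_max) ** (1.0 / p)
    return j_long, j_trans, scatter
```

With, say, `length=10.0`, `width=1.0`, and `p=4.0`, the returned scatter should agree closely with W/L = 0.1, and the first returned moment should be close to the closed-form value L W^(p+1) / [(p + 1) 2^p].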

These theories are very efficient in data estimation, approximation, and processing.

Acknowledgements to Anatolij Gelimson for our constructive discussions on coordinate system transformation invariances and his very useful remarks.

References

[1] Encyclopaedia of Mathematics. Ed. M. Hazewinkel. Volumes 1 to 10. Kluwer Academic Publ., Dordrecht, 1988-1994

[2] Lev Gelimson. Providing Helicopter Fatigue Strength: Flight Conditions [Overmathematics and Other Fundamental Mathematical Sciences]. In: Structural Integrity of Advanced Aircraft and Life Extension for Current Fleets – Lessons Learned in 50 Years After the Comet Accidents, Proceedings of the 23rd ICAF Symposium, Dalle Donne, C. (Ed.), 2005, Hamburg, Vol. II, 405-416

[3] Lev Gelimson. Overmathematics: Fundamental Principles, Theories, Methods, and Laws of Science. The “Collegium” All World Academy of Sciences Publishers, Munich, 2010

[4] Lev Gelimson. Fundamental Science of Estimation. The “Collegium” All World Academy of Sciences Publishers, Munich, 2010

[5] Lev Gelimson. Fundamental Science of Approximation. The “Collegium” All World Academy of Sciences Publishers, Munich, 2010

[6] Lev Gelimson. Fundamental Science of Data Modeling and Processing. The “Collegium” All World Academy of Sciences Publishers, Munich, 2010

[7] Lev Gelimson. General Data Direction, as well as Scatter and Trend Measure and Estimation Theories (Essential). Mathematical Journal of the “Collegium” All World Academy of Sciences, Munich (Germany), 11 (2011), 10

[8] Lev Gelimson. Corrections and Generalizations of the Least Square Method. In: Review of Aeronautical Fatigue Investigations in Germany during the Period May 2007 to April 2009, Ed. Dr. Claudio Dalle Donne, Pascal Vermeer, CTO/IW/MS-2009-076 Technical Report, International Committee on Aeronautical Fatigue, ICAF 2009, EADS Innovation Works Germany, 2009, 59-60

[9] Lev Gelimson. Least Biquadratic Distance Theories in Fundamental Sciences of Estimation, Approximation, Data Modeling and Processing (Essential). Mathematical Journal of the “Collegium” All World Academy of Sciences, Munich (Germany), 11 (2011), 11