Ph. D. & Dr. Sc. Lev Gelimson's Least Square Method Paradoxical Behavior Phenomenon by Rotating 2D Data

Least Square Method Paradoxical Behavior Phenomenon by Rotating 2D Data

© Ph. D. & Dr. Sc. Lev Gelimson

Academic Institute for Creating Fundamental Sciences (Munich, Germany)

Mathematical Journal

of the "Collegium" All World Academy of Sciences

Munich (Germany)

11 (2011), 24

In estimation, approximation, and data processing, the least square method (LSM) [1] of Legendre and Gauss is usually the only method applied to contradictory (e.g., overdetermined) problems, including those arising in the finite element method, point methods, etc. Overmathematics [2, 3] and the fundamental sciences of estimation [4], approximation [5], and data processing [7] have discovered many principal shortcomings [8] of the least square method. Additionally, consider its simplest and most typical approach. It minimizes the sum of the squared differences of one preselected coordinate only (e.g., the ordinate in a two-dimensional problem) between the graph of the desired approximation function and each of the given data points. The result depends on this preselection, ignores the remaining coordinates, and is not invariant under coordinate system rotation; hence it has no objective sense. Moreover, the method is correct only for constant approximation or in the absence of data scatter, and it gives systematic errors that increase with data scatter and with the deviation (namely, the declination) of the approximation from a constant.

Separate numeric tests and checks for particular methods and theories are well-known in classical mathematics [1] and natural sciences.

Metatheories of directed numeric test and check systems in overmathematics [2, 3] and fundamental sciences of estimation [4], approximation [5], data modeling [6] and processing [7] make it possible, not only to provide systematic tools of comprehensively testing and checking particular theories and methods, but also to develop them and even to create many principally new theories and methods.

Consider the least square method (LSM) [1] by rotating 2D data to be linearly approximated.

By coordinate system translation invariance of the given data, centralize them by subtracting every coordinate of the data center from the corresponding coordinate of every data point.

Let n (n ∈ N⁺ = {1, 2, ...}, n > 2) points [_j=1ⁿ (x'_j , y'_j )] = {(x'₁ , y'₁), (x'₂ , y'₂), ... , (x'_n , y'_n)} with any real coordinates be given. Use the clearly invariant centralization transformation x = x' - Σ_j=1ⁿ x'_j / n , y = y' - Σ_j=1ⁿ y'_j / n to obtain a coordinate system xOy central for the given data, and work further in this system with points [_j=1ⁿ (x_j , y_j)] to be approximated by a straight line y = ax containing the origin O(0, 0).

Use the least square method [1] in its common approach, which minimizes the sum of the squared y-coordinate differences between this line and each of the n data points [_j=1ⁿ (x_j , y_j)]:

²_yS(a) = Σ_j=1ⁿ (ax_j - y_j)²,

²_yS'_a = 2Σ_j=1ⁿ (ax_j - y_j)x_j = 0,

Σ_j=1ⁿ x_j² a = Σ_j=1ⁿ x_jy_j ,

a_LSM = Σ_j=1ⁿ x_jy_j / Σ_j=1ⁿ x_j² ,

²_yS''_aa = 2Σ_j=1ⁿ x_j² > 0

(in any nontrivial case), which ensures that ²_yS(a) indeed attains its minimum at a = a_y = a_LSM , the value of a obtained by minimizing the sum of the squared y-coordinate differences.
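The centralization and the closed-form slope derived above can be sketched numerically as follows. This is a minimal Python sketch under stated assumptions: the function names `centralize` and `lsm_slope` and the sample points are illustrative and do not come from the article.

```python
def centralize(points):
    """Shift the points so that their data center (mean) lies at the origin."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return [(x - cx, y - cy) for x, y in points]


def lsm_slope(points):
    """Slope a of y = a*x minimizing the sum of squared y-differences."""
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    if sxx == 0:
        raise ValueError("trivial case: all x-coordinates vanish")
    return sxy / sxx


# Hypothetical sample data (an assumption, not taken from the article).
raw = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.0)]
centered = centralize(raw)
a_lsm = lsm_slope(centered)
print(a_lsm)
```

Note that only the y-differences enter `lsm_slope`; exchanging the roles of x and y generally gives a different line, which is the one-sidedness discussed in the text.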

Nota bene: Unlike quadratic mean theories in fundamental sciences of estimation [4], approximation [5], data modeling [6] and processing [7], the least square method [1] in its common approach ignores minimizing the sum of the squared x-coordinate differences between this line and each of the n data points [_j=1ⁿ (x_j , y_j)] and is one-sided, which necessarily leads to its paradoxical behavior.

Hence in this case, metatheories of directed numeric test and check systems suggest considering any rotation of already centralized two-dimensional data points whose linear approximation (linear bisector) is a priori clear due to the mirror symmetry of these data.

Nota bene: Preliminarily centralizing these data universalizes the ranges of both coordinate axes under any rotation of them and brings clear simplification without any loss of generality.

Determine ²S_min(A), ²S_max(A), and then

S = S_L = [²S_min(A) / ²S_max(A)]^1/2

as a measure of data scatter with respect to linear approximation.

Also introduce a measure of data trend with respect to linear approximation

T = T_L = 1 - S = 1 - S_L = 1 - [²S_min(A) / ²S_max(A)]^1/2 .

Denote:

S and T obtained via the least square method (LSM) with S_LSM and T_LSM = 1 - S_LSM , respectively;

S and T obtained via distance quadrat theories (DQT) with S_DQT and T_DQT = 1 - S_DQT , respectively;

S and T obtained via general theories of moments of inertia (GTMI) with S_GTMI and T_GTMI = 1 - S_GTMI , respectively;

S and T obtained via quadratic mean theories (QMT) with S_QMT and T_QMT = 1 - S_QMT , respectively;

S and T obtained via rotation-quasi-invariant quadratic mean theories (RQIQMT) with S_RQIQMT and T_RQIQMT = 1 - S_RQIQMT , respectively.

Nota bene: Distance quadrat theories (DQT) and general theories of moments of inertia (GTMI) are invariant by any data rotation, have different forms and approaches but the same essence and hence always coinciding results. Therefore, we always obtain

S_DQT = S_GTMI ,

T_DQT = T_GTMI ,

as well as always correct slope values

a_DQT(β) = a_GTMI(β) = a_corr(β).
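The article does not give the DQT or GTMI formulae at this point. As a stand-in, the following hedged Python sketch uses the standard orthogonal-distance (principal-axis) fit through the origin, which minimizes the sum of squared perpendicular distances and is known to follow any rotation of the data. The names `principal_axis_angle` and `rotate`, and the sample data, are assumptions made for illustration only.

```python
import math


def principal_axis_angle(points):
    """Angle of the line through the origin that minimizes the sum of
    squared perpendicular distances (standard orthogonal fit)."""
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)


def rotate(points, beta):
    """Rotate the points anticlockwise by beta about the origin."""
    c, s = math.cos(beta), math.sin(beta)
    return [(x * c - y * s, x * s + y * c) for x, y in points]


# Hypothetical centered data, mirror-symmetric about the x-axis.
base = [(3.0, 1.0), (3.0, -1.0), (-3.0, 1.0), (-3.0, -1.0)]
beta = 1.3  # radians, inside (-pi/2, pi/2)
fitted_angle = principal_axis_angle(rotate(base, beta))
print(fitted_angle, math.tan(fitted_angle))
```

For these data the fitted angle coincides with the rotation angle, so the slope equals tan β, in line with the rotation invariance stated above.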

To provide clearly best linear approximation (bisectors), take the given data to be central and additionally mirror-symmetric with respect to the x-axis and rotate these data about their center coinciding with the coordinate system origin.

Nota bene: To rotate these data about their center coinciding with the coordinate system origin in the same coordinate system Oxy by any angle β positive in the anticlockwise direction, also consider another coordinate system Ox'y' obtained from Oxy by its turning (rotating) about their common origin O namely by angle -β positive in the anticlockwise direction so that we have axis Ox' from Ox and axis Oy' from Oy . We obtain [1]

x' = x cos (-β) + y sin (-β) = x cos β - y sin β ,

y' = - x sin (-β) + y cos (-β) = x sin β + y cos β .

Now replace each point (x , y) with point (x' , y') and place this point (x' , y') in coordinate system Oxy but NOT in coordinate system Ox'y'.

Nota bene: Here x , y , x' , y' are simply real numbers, independent of these designations. Therefore, there is no problem in placing this point (x' , y') in coordinate system Oxy rather than in coordinate system Ox'y'. Moreover, there is no need to show this auxiliary coordinate system Ox'y', which was used above only to obtain the transformation formulae for x' and y' via x and y .

Now we can rotate data about their center coinciding with the coordinate system origin in the same coordinate system Oxy by any angle β positive in the anticlockwise direction.
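The transformation formulae above translate directly into code. The following minimal Python sketch rotates points anticlockwise by β within the same coordinate system Oxy; the function name `rotate` and the two test points are illustrative assumptions.

```python
import math


def rotate(points, beta):
    """Apply x' = x cos(beta) - y sin(beta), y' = x sin(beta) + y cos(beta)."""
    c, s = math.cos(beta), math.sin(beta)
    return [(x * c - y * s, x * s + y * c) for x, y in points]


# A quarter turn maps the unit vector along Ox onto Oy, and Oy onto -Ox.
quarter_turn = rotate([(1.0, 0.0), (0.0, 1.0)], math.pi / 2)
print(quarter_turn)
```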

It is especially important to determine some finite set of the most characteristic angles β in the half-open interval [0, π), which includes endpoint 0 but excludes endpoint π . Although 2π is a period of the functions sin β and cos β , the centrality and mirror symmetry of the given data with respect to the x-axis make the rotation of such data about their center (coinciding with the coordinate system origin) periodic with period π .

This set should contain:

0 and π/2 as the discontinuity points of quadratic and other power mean theories due to the jumps of the function sign z at z = 0 both to the left and to the right;

some points in half-open (and half-closed) interval [0, π) which are very close (plot-coinciding) to the both above discontinuity points 0 (from the right only) and π/2 (both from the left and from the right), as well as π (from the left only), e.g. π/180, π/2 - π/180, π/2 + π/180, and π - π/180;

π/4 and 3π/4 as the angles of precisely coinciding linear approximation (bisectors) via distance quadrat theories (DQT), general theories of moments of inertia (GTMI), as well as both usual and rotation-quasi-invariant power mean theories (PMT) naturally generalizing quadratic mean theories (QMT). The same also holds for angles 0 and π/2 already taken;

π/8, 3π/8, 5π/8, and 7π/8 as the angles of bisectors of the angles between 0 and π/4, π/4 and π/2, π/2 and 3π/4, as well as 3π/4 and π, respectively, because we can expect that precisely at these bisector angles, the differences between linear approximation (bisectors) via distance quadrat theories (DQT) and general theories of moments of inertia (GTMI), on the one hand, and both usual and rotation-quasi-invariant power mean theories (PMT) naturally generalizing quadratic mean theories (QMT), on the other hand, could be near to the greatest values of these differences;

possibly some additional characteristic angles playing important roles in trigonometry, e.g. π/6, π/3, 2π/3, and 5π/6, as well as some additional intermediate angles, e.g. π/18, π/2 - π/18, π/2 + π/18, and π - π/18, to reduce the greatest angle intervals between the remaining angles.

It is reasonable to investigate linear approximation to (bisectors of) both relatively well-directed and weakly (hardly) directed (very scattered, diffuse) data by such rotations.

Now denote

x(β) = x cos β - y sin β ,

y(β) = x sin β + y cos β

and apply the least square method [1] in its common approach, which minimizes the sum of the squared y-coordinate differences between this line and each of the n data points {_j=1ⁿ [x_j(β) , y_j(β)]}:

a_LSM(β) = Σ_j=1ⁿ x_j(β)y_j(β) / Σ_j=1ⁿ x_j²(β) .

We consequently obtain

x(β)y(β) = (x cos β - y sin β)(x sin β + y cos β) = 1/2 (x² - y²) sin 2β + xy cos 2β ,

x²(β) = (x cos β - y sin β)² = x² cos² β - xy sin 2β + y² sin² β ,

Σ_j=1ⁿ x_j(β)y_j(β) = 1/2 (Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²) sin 2β + Σ_j=1ⁿ x_jy_j cos 2β ,

Σ_j=1ⁿ x_j²(β) = Σ_j=1ⁿ x_j² cos² β - Σ_j=1ⁿ x_jy_j sin 2β + Σ_j=1ⁿ y_j² sin² β =

1/2 (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²) + 1/2 (Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²) cos 2β - Σ_j=1ⁿ x_jy_j sin 2β,

a_LSM(β) = [1/2 (Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²) sin 2β + Σ_j=1ⁿ x_jy_j cos 2β] / [1/2 (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²) + 1/2 (Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²) cos 2β - Σ_j=1ⁿ x_jy_j sin 2β],
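As a numerical cross-check of the closed form above, the following Python sketch compares it with the slope computed directly from the rotated points. The function names and the sample points are assumptions chosen for illustration; the points are already centered but deliberately not mirror-symmetric, so that the term with Σ x_jy_j is exercised too.

```python
import math


def a_lsm_direct(points, beta):
    """Rotate the points by beta, then fit y = a*x by least squares in y."""
    c, s = math.cos(beta), math.sin(beta)
    rot = [(x * c - y * s, x * s + y * c) for x, y in points]
    return sum(x * y for x, y in rot) / sum(x * x for x, _ in rot)


def a_lsm_closed(sxx, syy, sxy, beta):
    """Closed form of a_LSM(beta) in terms of the unrotated sums."""
    num = 0.5 * (sxx - syy) * math.sin(2 * beta) + sxy * math.cos(2 * beta)
    den = (0.5 * (sxx + syy) + 0.5 * (sxx - syy) * math.cos(2 * beta)
           - sxy * math.sin(2 * beta))
    return num / den


# Hypothetical centered points (coordinate sums vanish).
pts = [(2.0, 0.5), (-1.0, 1.0), (0.5, -2.0), (-1.5, 0.5)]
sxx = sum(x * x for x, _ in pts)
syy = sum(y * y for _, y in pts)
sxy = sum(x * y for x, y in pts)

for beta in (0.0, 0.3, 1.0, 2.5):
    print(beta, a_lsm_direct(pts, beta), a_lsm_closed(sxx, syy, sxy, beta))
```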

da_LSM(β)/dβ = {[(Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²) cos 2β - 2Σ_j=1ⁿ x_jy_j sin 2β] [1/2 (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²) + 1/2 (Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²) cos 2β - Σ_j=1ⁿ x_jy_j sin 2β] -

[1/2 (Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²) sin 2β + Σ_j=1ⁿ x_jy_j cos 2β] [- (Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²) sin 2β - 2Σ_j=1ⁿ x_jy_j cos 2β]}/

[1/2 (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²) + 1/2 (Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²) cos 2β - Σ_j=1ⁿ x_jy_j sin 2β]² =

{1/2 (Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²)² + 2(Σ_j=1ⁿ x_jy_j)² + 1/2 [(Σ_j=1ⁿ x_j²)² - (Σ_j=1ⁿ y_j²)²] cos 2β - Σ_j=1ⁿ x_jy_j (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²) sin 2β}/

[1/2 (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²) + 1/2 (Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²) cos 2β - Σ_j=1ⁿ x_jy_j sin 2β]² .
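The simplified derivative above can be checked against a central finite difference of the closed-form slope. This is a hedged Python sketch; the function names, the step size, and the values of the three sums are assumptions made only for the check.

```python
import math


def slope(sxx, syy, sxy, beta):
    """Closed-form a_LSM(beta)."""
    num = 0.5 * (sxx - syy) * math.sin(2 * beta) + sxy * math.cos(2 * beta)
    den = (0.5 * (sxx + syy) + 0.5 * (sxx - syy) * math.cos(2 * beta)
           - sxy * math.sin(2 * beta))
    return num / den


def slope_derivative(sxx, syy, sxy, beta):
    """Simplified analytic derivative of a_LSM with respect to beta."""
    a, b = sxx - syy, sxx + syy  # note that a * b = sxx**2 - syy**2
    num = (0.5 * a * a + 2 * sxy * sxy + 0.5 * a * b * math.cos(2 * beta)
           - sxy * b * math.sin(2 * beta))
    den = (0.5 * b + 0.5 * a * math.cos(2 * beta)
           - sxy * math.sin(2 * beta)) ** 2
    return num / den


# Hypothetical sums for a centered, non-symmetric data set.
sxx, syy, sxy = 7.5, 5.5, -1.75
beta, h = 0.7, 1e-6
numeric = (slope(sxx, syy, sxy, beta + h) - slope(sxx, syy, sxy, beta - h)) / (2 * h)
analytic = slope_derivative(sxx, syy, sxy, beta)
print(numeric, analytic)
```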

Therefore, if slope a_LSM(β) takes an extreme value at some β , then the necessary condition

1/2 (Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²)² + 2(Σ_j=1ⁿ x_jy_j)² + 1/2 [(Σ_j=1ⁿ x_j²)² - (Σ_j=1ⁿ y_j²)²] cos 2β - Σ_j=1ⁿ x_jy_j (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²) sin 2β = 0,

or, equivalently,

(Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²)² + 4(Σ_j=1ⁿ x_jy_j)² + [(Σ_j=1ⁿ x_j²)² - (Σ_j=1ⁿ y_j²)²] cos 2β - 2Σ_j=1ⁿ x_jy_j (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²) sin 2β = 0,

is satisfied. This gives

[(Σ_j=1ⁿ y_j²)² - (Σ_j=1ⁿ x_j²)²] cos 2β + 2Σ_j=1ⁿ x_jy_j (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²) sin 2β = (Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²)² + 4(Σ_j=1ⁿ x_jy_j)² .

Divide this equation in β by

{[(Σ_j=1ⁿ y_j²)² - (Σ_j=1ⁿ x_j²)²]² + [2Σ_j=1ⁿ x_jy_j (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²)]²}^1/2

so that the sum of the squares of the factors at cos 2β and sin 2β equals 1.

Then find any angle φ for which both conditions

cos φ = [(Σ_j=1ⁿ y_j²)² - (Σ_j=1ⁿ x_j²)²] / {[(Σ_j=1ⁿ y_j²)² - (Σ_j=1ⁿ x_j²)²]² + [2Σ_j=1ⁿ x_jy_j (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²)]²}^1/2 ,

sin φ = 2Σ_j=1ⁿ x_jy_j (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²) / {[(Σ_j=1ⁿ y_j²)² - (Σ_j=1ⁿ x_j²)²]² + [2Σ_j=1ⁿ x_jy_j (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²)]²}^1/2

are satisfied.

Now we have

cos (2β - φ) = [(Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²)² + 4(Σ_j=1ⁿ x_jy_j)²] / {[(Σ_j=1ⁿ y_j²)² - (Σ_j=1ⁿ x_j²)²]² + [2Σ_j=1ⁿ x_jy_j (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²)]²}^1/2 .

Finally, we obtain

2β - φ = ± arccos {[(Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²)² + 4(Σ_j=1ⁿ x_jy_j)²] / {[(Σ_j=1ⁿ y_j²)² - (Σ_j=1ⁿ x_j²)²]² + [2Σ_j=1ⁿ x_jy_j (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²)]²}^1/2} + 2πk

where

k = 0, ±1, ±2, ...

and

β = 1/2 φ ± 1/2 arccos {[(Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²)² + 4(Σ_j=1ⁿ x_jy_j)²] / {[(Σ_j=1ⁿ y_j²)² - (Σ_j=1ⁿ x_j²)²]² + [2Σ_j=1ⁿ x_jy_j (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²)]²}^1/2} + πk .

If

Σ_j=1ⁿ x_jy_j = 0,

which is the case, e.g., for data mirror-symmetric with respect to the x-axis, and Σ_j=1ⁿ x_j² ≠ Σ_j=1ⁿ y_j² , then we simply obtain

sin φ = 0,

cos φ = 1, φ = 0 if Σ_j=1ⁿ y_j² > Σ_j=1ⁿ x_j² ,

cos φ = -1, φ = π if Σ_j=1ⁿ y_j² < Σ_j=1ⁿ x_j² ,

and in both cases

β = ± 1/2 arccos {(Σ_j=1ⁿ y_j² - Σ_j=1ⁿ x_j²)² / [(Σ_j=1ⁿ y_j²)² - (Σ_j=1ⁿ x_j²)²]} + πk =

± 1/2 arccos [(Σ_j=1ⁿ y_j² - Σ_j=1ⁿ x_j²) / (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²)] + πk .

Naturally, we can come to the same result by directly simplifying the necessary condition

1/2 (Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²)² + 2(Σ_j=1ⁿ x_jy_j)² + 1/2 [(Σ_j=1ⁿ x_j²)² - (Σ_j=1ⁿ y_j²)²] cos 2β - Σ_j=1ⁿ x_jy_j (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²) sin 2β = 0

for the case

Σ_j=1ⁿ x_jy_j = 0.

Then

1/2 (Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²)² + 1/2 [(Σ_j=1ⁿ x_j²)² - (Σ_j=1ⁿ y_j²)²] cos 2β = 0,

(Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²)² - [(Σ_j=1ⁿ y_j²)² - (Σ_j=1ⁿ x_j²)²] cos 2β = 0,

β = ± 1/2 arccos [(Σ_j=1ⁿ y_j² - Σ_j=1ⁿ x_j²) / (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²)] + πk .

Nota bene: We obtained the necessary condition of extremum only.

If at any such critical value of β we have

d²a_LSM(β)/dβ² > 0,

then function a_LSM(β) takes here its local minimum value.

If at any such critical value of β we have

d²a_LSM(β)/dβ² < 0,

then function a_LSM(β) takes here its local maximum value.

If at any such critical value of β we have

d²a_LSM(β)/dβ² = 0,

then we have to further investigate function a_LSM(β) to determine whether it takes here its local minimum value, maximum value, or no extremum value at all. However, this last condition usually holds at other values of β only; check that it is not satisfied at the critical value considered.

We obtain

da_LSM(β)/dβ = {2(Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²)² + 8(Σ_j=1ⁿ x_jy_j)² + 2[(Σ_j=1ⁿ x_j²)² - (Σ_j=1ⁿ y_j²)²] cos 2β - 4Σ_j=1ⁿ x_jy_j (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²) sin 2β}/

[(Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²) + (Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²) cos 2β - 2Σ_j=1ⁿ x_jy_j sin 2β]² ,

d²a_LSM(β)/dβ² = {8Σ_j=1ⁿ x_jy_j (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²)(Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²) cos 4β +

(Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²) [2(Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²)² - 8(Σ_j=1ⁿ x_jy_j)²] sin 4β +

[(Σ_j=1ⁿ x_j²)² + (Σ_j=1ⁿ y_j²)² - 6Σ_j=1ⁿ x_j² Σ_j=1ⁿ y_j² + 8(Σ_j=1ⁿ x_jy_j)²][8Σ_j=1ⁿ x_jy_j cos 2β + 4(Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²) sin 2β]}/

[(Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²) + (Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²) cos 2β - 2Σ_j=1ⁿ x_jy_j sin 2β]³

and have to check

sign{4Σ_j=1ⁿ x_jy_j (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²)(Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²) cos 4β + (Σ_j=1ⁿ x_j² + Σ_j=1ⁿ y_j²) [(Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²)² - 4(Σ_j=1ⁿ x_jy_j)²] sin 4β +

[(Σ_j=1ⁿ x_j²)² + (Σ_j=1ⁿ y_j²)² - 6Σ_j=1ⁿ x_j² Σ_j=1ⁿ y_j² + 8(Σ_j=1ⁿ x_jy_j)²][4Σ_j=1ⁿ x_jy_j cos 2β + 2(Σ_j=1ⁿ x_j² - Σ_j=1ⁿ y_j²) sin 2β]}

(half the numerator) because Σ_j=1ⁿ x_j²(β) and hence the denominator cannot be negative.

A second approach to such investigation is to use directed numeric tests via finitely increasing and decreasing any such critical value of β and checking the sign of da_LSM(β)/dβ , or, equivalently, the left-hand side of the above necessary condition.
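This second approach can be sketched in Python as follows. Because the denominator of da_LSM(β)/dβ is a square, only the sign of the numerator (the left-hand side of the necessary condition) is tested slightly to the left and to the right of a critical angle. The function names and the step size are assumptions; the three sums are those stated in the article for its test data.

```python
import math


def derivative_numerator(sxx, syy, sxy, beta):
    """Left-hand side of the necessary condition; it has the sign of da/dbeta."""
    a, b = sxx - syy, sxx + syy
    return (0.5 * a * a + 2 * sxy * sxy + 0.5 * a * b * math.cos(2 * beta)
            - sxy * b * math.sin(2 * beta))


def classify_critical_point(sxx, syy, sxy, beta_c, step=1e-3):
    """Directed numeric test: compare signs just left and right of beta_c."""
    left = derivative_numerator(sxx, syy, sxy, beta_c - step)
    right = derivative_numerator(sxx, syy, sxy, beta_c + step)
    if left > 0 > right:
        return "local maximum"
    if left < 0 < right:
        return "local minimum"
    return "undecided"


sxx, syy, sxy = 220.0, 22.0, 0.0  # sums stated in the article
beta_c = 0.5 * math.acos((syy - sxx) / (sxx + syy))
kind = classify_critical_point(sxx, syy, sxy, beta_c)
kind_mirror = classify_critical_point(sxx, syy, sxy, math.pi - beta_c)
print(kind, kind_mirror)
```

The step size is a judgment call: it must be small enough not to jump over a neighboring critical point, yet large enough for the sign to be numerically reliable.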

In the following system of directed numeric tests and checks in Figures 1-36, we have

Σ_j=1ⁿ x_j² = 220,

Σ_j=1ⁿ x_jy_j = 0

because of 2D data mirror symmetry with respect to the x-axis,

Σ_j=1ⁿ y_j² = 22,

β = β_max ≈ 72.45159939(π/180),

d²a_LSM(β)/dβ² < 0,

therefore, a_LSM(β) takes here its local maximum

a_LSM_max ≈ 1.423024947.
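These two numbers can be reproduced from the stated sums alone. The following minimal Python sketch evaluates the critical-angle formula for Σ x_jy_j = 0 and then the closed-form slope at that angle; the variable names are illustrative assumptions.

```python
import math

sxx, syy = 220.0, 22.0  # sums stated in the article; the mixed sum vanishes

# Critical angle for vanishing mixed sum (branch with the plus sign, k = 0).
beta_max = 0.5 * math.acos((syy - sxx) / (sxx + syy))

# Closed-form slope at the critical angle (terms with the mixed sum dropped).
num = 0.5 * (sxx - syy) * math.sin(2 * beta_max)
den = 0.5 * (sxx + syy) + 0.5 * (sxx - syy) * math.cos(2 * beta_max)
a_lsm_max = num / den

print(math.degrees(beta_max), a_lsm_max, math.degrees(math.atan(a_lsm_max)))
```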

Nota bene: Independently of 2D data, the least square method (LSM) [1] can give slopes a_LSM(β) between -a_LSM_max and a_LSM_max only (both bounds included). This is paradoxical, especially because these predefined bounds on slopes a_LSM(β) are relatively narrow, whereas the best linear approximation to some data, e.g. data lying on a straight line y = ax where |a| can exceed any predefined positive number, can have a slope which lies far beyond these predefined bounds.

Unlike always correct slope a_corr(β) via distance quadrat theories (DQT) and general theories of moments of inertia (GTMI), slope a_LSM(β) via the least square method (LSM) [1] shows the following paradoxical behavior by rotating 2D data to be linearly approximated:

1. At vanishing 2D data rotation angle β , we also have vanishing slope a_LSM(β), which is correct. If β increases from 0 to β_max , then slope a_LSM(β) also increases from 0 to a_LSM_max . This direction of change is correct. But the very existence of this maximum is paradoxical. Further, arctan a_LSM(β) lags behind β , and this lag increases with increasing β . At β = β_max , we have

arctan a_LSM(β_max) ≈ 54.90319877(π/180) ≈ 0.75β_max < β_max ≈ 72.45159939(π/180).

2. If 2D data rotation angle β further increases from β_max to π/2, then slope a_LSM(β) turns to decrease from a_LSM_max to 0. This direction of change is paradoxical, as is the very existence of this maximum. At π/2 from the left-hand side, a_LSM(β) should take its correct maximum value (theoretically plus infinity), which a_corr(β) does. Instead, a_LSM(β) vanishes, taking its minimum value 0, which is also paradoxical. Further, arctan a_LSM(β) lags behind β , and this lag increases with increasing β .

3. At π/2 from the right-hand side, a_LSM(β) should take its correct value (theoretically minus infinity), which a_corr(β) does. Instead, a_LSM(β) vanishes, which is also paradoxical. If β increases from π/2 to π - β_max ≈ 107.5484006(π/180), then slope a_LSM(β) continues to decrease in its algebraic value from 0 to -a_LSM_max . This direction of change is paradoxical, as is the very existence of the minimum -a_LSM_max (note this negative sign) of a_LSM(β). Further, arctan a_LSM(β) lags behind β .

4. If 2D data rotation angle β further increases from π - β_max to π , then slope a_LSM(β) turns to increase from the minimum -a_LSM_max (note this negative sign) of a_LSM(β) to 0. This direction of change is correct. But the very existence of this minimum is paradoxical. At π , we have vanishing slope a_LSM(β), which is correct.

For this phenomenon, note the periodicity with period π and mirror symmetricity with respect to π/2.
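The behavior described in items 1 to 4 can be tabulated with a short Python sketch. It evaluates the closed-form a_LSM(β) for the sums stated in the article (Σ x_j² = 220, Σ y_j² = 22, Σ x_jy_j = 0) next to tan β, the slope of the rotated symmetry axis of the data. The function name `a_lsm` and the chosen angles are illustrative assumptions.

```python
import math

SXX, SYY = 220.0, 22.0  # sums stated in the article; the mixed sum vanishes


def a_lsm(beta):
    """Closed-form LSM slope after rotating the data by beta."""
    num = 0.5 * (SXX - SYY) * math.sin(2 * beta)
    den = 0.5 * (SXX + SYY) + 0.5 * (SXX - SYY) * math.cos(2 * beta)
    return num / den


# Columns: angle in degrees, LSM slope, slope tan(beta) of the symmetry axis,
# and arctan of the LSM slope in degrees.
for deg in (0, 30, 60, 72.45159939, 80, 89, 91, 100, 107.5484006, 150):
    beta = math.radians(deg)
    a = a_lsm(beta)
    print(deg, a, math.tan(beta), math.degrees(math.atan(a)))
```

In the printed table the LSM slope stays bounded and returns to 0 near β = π/2, while tan β grows without bound there.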

To show this phenomenon, also consider additional 2D data rotation angles 7π/18, 72.45159939π/180, 5π/12, 13π/30, 41π/90, 7π/15, 43π/90, 22π/45, 23π/45, 47π/90, 8π/15, 49π/90, 17π/30, 7π/12, 107.5484006π/180, and 11π/18.

Figure 1. β = 0. S_best = S_DQT = S_GTMI = S_LSM = S_QMT = S_RQIQMT ≈ 0.31623.

Figure 2. β = π/180. S_best = S_DQT = S_GTMI ≈ 0.316228. S_LSM ≈ 0.316233. S_QMT ≈ 0.432334. S_RQIQMT ≈ 0.316232.

Figure 3. β = π/18. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.3167. S_QMT ≈ 0.3604. S_RQIQMT ≈ 0.3171.

Figure 4. β = π/8. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.3189. S_QMT ≈ 0.3272. S_RQIQMT ≈ 0.3170.

Figure 5. β = π/6. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.3214. S_QMT ≈ 0.3203. S_RQIQMT ≈ 0.3166.

Figure 6. β = π/4. S_best = S_DQT = S_GTMI = S_QMT = S_RQIQMT ≈ 0.316. S_LSM ≈ 0.331.

Figure 7. β = π/3. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.3600. S_QMT ≈ 0.3203. S_RQIQMT ≈ 0.3166.

Figure 8. β = 3π/8. S_best = S_DQT = S_GTMI ≈ 0.3163. S_LSM ≈ 0.3967. S_QMT ≈ 0.3272. S_RQIQMT ≈ 0.3170.

Figure 9. β = 7π/18. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.4173. S_QMT ≈ 0.3309. S_RQIQMT ≈ 0.3172.

Figure 10. β ≈ 72.45159939π/180. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.4450. S_QMT ≈ 0.3356. S_RQIQMT ≈ 0.3172.

Figure 11. β = 5π/12. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.4858. S_QMT ≈ 0.3418. S_RQIQMT ≈ 0.3173.

Figure 12. β = 13π/30. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.5607. S_QMT ≈ 0.3518. S_RQIQMT ≈ 0.3172.

Figure 13. β = 4π/9. S_best = S_DQT = S_GTMI ≈ 0.3163. S_LSM ≈ 0.6391. S_QMT ≈ 0.3604. S_RQIQMT ≈ 0.3171.

Figure 14. β = 41π/90. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.7597. S_QMT ≈ 0.3712. S_RQIQMT ≈ 0.3169.

Figure 15. β = 7π/15. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.9601. S_QMT ≈ 0.3844. S_RQIQMT ≈ 0.3166.

Figure 16. β = 43π/90. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 1.3345. S_QMT ≈ 0.4008. S_RQIQMT ≈ 0.3164.

Figure 17. β = 22π/45. S_best = S_DQT = S_GTMI ≈ 0.31623. S_LSM ≈ 2.13554. S_QMT ≈ 0.42079. S_RQIQMT ≈ 0.31626.

Figure 18. β = 89π/180. S_best = S_DQT = S_GTMI ≈ 0.316228. S_LSM ≈ 2.772737. S_QMT ≈ 0.432334. S_RQIQMT ≈ 0.316232.

Figure 19. β = π/2. S_best = S_DQT = S_GTMI = S_RQIQMT ≈ 0.316. S_LSM ≈ 3.162. S_QMT ≈ 0.445.

Figure 20. β = 91π/180. S_best = S_DQT = S_GTMI ≈ 0.316228. S_LSM ≈ 2.772737. S_QMT ≈ 0.432334. S_RQIQMT ≈ 0.316232.

Figure 21. β = 23π/45. S_best = S_DQT = S_GTMI ≈ 0.31623. S_LSM ≈ 2.13554. S_QMT ≈ 0.42079. S_RQIQMT ≈ 0.31626.

Figure 22. β = 47π/90. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 1.3345. S_QMT ≈ 0.4008. S_RQIQMT ≈ 0.3164.

Figure 23. β = 8π/15. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.9601. S_QMT ≈ 0.3844. S_RQIQMT ≈ 0.3166.

Figure 24. β = 49π/90. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.7597. S_QMT ≈ 0.3712. S_RQIQMT ≈ 0.3169.

Figure 25. β = 5π/9. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.6391. S_QMT ≈ 0.3604. S_RQIQMT ≈ 0.3171.

Figure 26. β = 17π/30. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.5607. S_QMT ≈ 0.3518. S_RQIQMT ≈ 0.3172.

Figure 27. β = 7π/12. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.4858. S_QMT ≈ 0.3418. S_RQIQMT ≈ 0.3173.

Figure 28. β ≈ 107.5484006π/180. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.4450. S_QMT ≈ 0.3356. S_RQIQMT ≈ 0.3172.

Figure 29. β = 11π/18. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.4173. S_QMT ≈ 0.3309. S_RQIQMT ≈ 0.3172.

Figure 30. β = 5π/8. S_best = S_DQT = S_GTMI ≈ 0.3163. S_LSM ≈ 0.3967. S_QMT ≈ 0.3272. S_RQIQMT ≈ 0.3170.

Figure 31. β = 2π/3. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.3600. S_QMT ≈ 0.3203. S_RQIQMT ≈ 0.3166.

Figure 32. β = 3π/4. S_best = S_DQT = S_GTMI = S_QMT = S_RQIQMT ≈ 0.316. S_LSM ≈ 0.331.

Figure 33. β = 5π/6. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.3214. S_QMT ≈ 0.3203. S_RQIQMT ≈ 0.3166.

Figure 34. β = 7π/8. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.3189. S_QMT ≈ 0.3272. S_RQIQMT ≈ 0.3170.

Figure 35. β = 17π/18. S_best = S_DQT = S_GTMI ≈ 0.3162. S_LSM ≈ 0.3167. S_QMT ≈ 0.3604. S_RQIQMT ≈ 0.3171.

Figure 36. β = 179π/180. S_best = S_DQT = S_GTMI ≈ 0.316228. S_LSM ≈ 0.316233. S_QMT ≈ 0.432334. S_RQIQMT ≈ 0.316232.

Nota bene:

1. Invert impossible values S_LSM > 1 to obtain correct values of S_LSM also providing correct values of T_LSM . Such impossible values show paradoxical mutual replacement of the maximum and minimum values of the sums of the above squared distances. The cause of this replacement is that the least square method paradoxically gives (quasi)horizontal linear approximation to 2D data with (quasi)vertical best linear approximation.

2. Adequate distance quadrat theories (DQT), general theories of moments of inertia (GTMI), quadratic mean theories (QMT) and other power mean theories (PMT), as well as rotation-quasi-invariant quadratic mean theories (RQIQMT) and other power mean theories (RQIPMT) are very efficient in data estimation, approximation, and processing and reliable even by great data scatter and give, e.g., best linear approximation to any 2D data.
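The values of S_LSM listed in the figure captions, including the impossible ones above 1 and their inversion, can be reproduced by the following Python sketch. It rests on an assumption that is not spelled out in this article: S_LSM is taken as the square root of the ratio of the sum of squared distances from the data to the fitted LSM line to the same sum for the perpendicular line through the data center. The four data points are also hypothetical; they are chosen only to be central, mirror-symmetric, and to give Σ x_j² = 220 and Σ y_j² = 22.

```python
import math

# Hypothetical data: central, mirror-symmetric, sums of squares 220 and 22.
BASE = [(sx * math.sqrt(55.0), sy * math.sqrt(5.5))
        for sx in (1, -1) for sy in (1, -1)]


def rotate(points, beta):
    """Rotate the points anticlockwise by beta about the origin."""
    c, s = math.cos(beta), math.sin(beta)
    return [(x * c - y * s, x * s + y * c) for x, y in points]


def s_lsm_raw(points):
    """Assumed scatter measure for the LSM line y = a*x (may exceed 1)."""
    a = sum(x * y for x, y in points) / sum(x * x for x, _ in points)
    theta = math.atan(a)
    c, s = math.cos(theta), math.sin(theta)
    d_line = sum((x * s - y * c) ** 2 for x, y in points)  # to the LSM line
    d_perp = sum((x * c + y * s) ** 2 for x, y in points)  # to its normal
    return math.sqrt(d_line / d_perp)


def s_lsm(points):
    """Invert impossible values above 1, as prescribed in item 1."""
    raw = s_lsm_raw(points)
    return 1.0 / raw if raw > 1.0 else raw


for beta in (0.0, math.pi / 4, math.pi / 3, math.pi / 2):
    pts = rotate(BASE, beta)
    print(beta, s_lsm_raw(pts), s_lsm(pts))
```

Under this reading, the raw value at β = π/2 is about 3.162 and its inverse is about 0.316, which agrees with the caption of Figure 19.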

Acknowledgements to Anatolij Gelimson for our constructive discussions on coordinate system transformation invariances and his very useful remarks.

References

[1] Encyclopaedia of Mathematics. Ed. M. Hazewinkel. Volumes 1 to 10. Kluwer Academic Publ., Dordrecht, 1988-1994

[2] Lev Gelimson. Providing Helicopter Fatigue Strength: Flight Conditions. In: Structural Integrity of Advanced Aircraft and Life Extension for Current Fleets – Lessons Learned in 50 Years After the Comet Accidents, Proceedings of the 23rd ICAF Symposium, Dalle Donne, C. (Ed.), 2005, Hamburg, Vol. II, 405-416

[3] Lev Gelimson. Overmathematics: Fundamental Principles, Theories, Methods, and Laws of Science. The "Collegium" All World Academy of Sciences Publishers, Munich, 2010

[4] Lev Gelimson. Fundamental Science of Estimation. The "Collegium" All World Academy of Sciences Publishers, Munich, 2010

[5] Lev Gelimson. Fundamental Science of Approximation. The "Collegium" All World Academy of Sciences Publishers, Munich, 2010

[6] Lev Gelimson. Fundamental Science of Data Modeling. The "Collegium" All World Academy of Sciences Publishers, Munich, 2010

[7] Lev Gelimson. Fundamental Science of Data Processing. The "Collegium" All World Academy of Sciences Publishers, Munich, 2010

[8] Lev Gelimson. Corrections and Generalizations of the Least Square Method. In: Review of Aeronautical Fatigue Investigations in Germany during the Period May 2007 to April 2009, Ed. Dr. Claudio Dalle Donne, Pascal Vermeer, CTO/IW/MS-2009-076 Technical Report, International Committee on Aeronautical Fatigue, ICAF 2009, EADS Innovation Works Germany, 2009, 59-60