OSSIM - Open Source Software Image Map  Version 1.9.0 (20180803)
newmatnl.h
Go to the documentation of this file.
1 //$$ newmatnl.h definition file for non-linear optimisation
2 
3 // Copyright (C) 1993,4,5: R B Davies
4 
5 #ifndef NEWMATNL_LIB
6 #define NEWMATNL_LIB 0
7 
8 #include <ossim/matrix/newmat.h>
9 
10 #ifdef use_namespace
11 namespace NEWMAT {
12 #endif
13 
14 
15 
16 /*
17 
18 This is a beginning of a series of classes for non-linear optimisation.
19 
20 At present there are two classes. FindMaximum2 is the basic optimisation
21 strategy when one is doing an optimisation where one has first
22 derivatives and estimates of the second derivatives. Class
23 NonLinearLeastSquares is derived from FindMaximum2. This provides the
24 functions that calculate function values and derivatives.
25 
26 A third class is now added. This is for doing maximum-likelihood when
27 you have first derviatives and something like the Fisher Information
28 matrix (eg the variance covariance matrix of the first derivatives or
29 minus the second derivatives - this matrix is assumed to be positive
30 definite).
31 
32 
33 
34  class FindMaximum2
35 
36 Suppose T is the ColumnVector of parameters, F(T) the function we want
37 to maximise, D(T) the ColumnVector of derivatives of F with respect to
38 T, and S(T) the matrix of second derivatives.
39 
40 Then the basic iteration is given a value of T, update it to
41 
42  T - S.i() * D
43 
44 where .i() denotes inverse.
45 
46 If F was quadratic this would give exactly the right answer (except it
47 might get a minimum rather than a maximum). Since F is not usually
48 quadratic, the simple procedure would be to recalculate S and D with the
49 new value of T and keep iterating until the process converges. This is
50 known as the method of conjugate gradients.
51 
52 In practice, this method may not converge. FindMaximum2 considers an
53 iteration of the form
54 
55  T - x * S.i() * D
56 
57 where x is a number. It tries x = 1 and uses the values of F and its
58 slope with respect to x at x = 0 and x = 1 to fit a cubic in x. It then
59 choses x to maximise the resulting function. This gives our new value of
60 T. The program checks that the value of F is getting better and carries
61 out a variety of strategies if it is not.
62 
63 The program also has a second strategy. If the successive values of T
64 seem to be lying along a curve - eg we are following along a curved
65 ridge, the program will try to fit this ridge and project along it. This
66 does not work at present and is commented out.
67 
68 FindMaximum2 has three virtual functions which need to be over-ridden by
69 a derived class.
70 
71  void Value(const ColumnVector& T, bool wg, Real& f, bool& oorg);
72 
73 T is the column vector of parameters. The function returns the value of
74 the function to f, but may instead set oorg to true if the parameter
75 values are not valid. If wg is true it may also calculate and store the
76 second derivative information.
77 
78  bool NextPoint(ColumnVector& H, Real& d);
79 
80 Using the value of T provided in the previous call of Value, find the
81 conjugate gradients adjustment to T, that is - S.i() * D. Also return
82 
83  d = D.t() * S.i() * D.
84 
85 NextPoint should return true if it considers that the process has
86 converged (d very small) and false otherwise. The previous call of Value
87 will have set wg to true, so that S will be available.
88 
89  Real LastDerivative(const ColumnVector& H);
90 
91 Return the scalar product of H and the vector of derivatives at the last
92 value of T.
93 
94 The function Fit is the function that calls the iteration.
95 
96  void Fit(ColumnVector&, int);
97 
98 The arguments are the trial parameter values as a ColumnVector and the
99 maximum number of iterations. The program calls a DataException if the
100 initial parameters are not valid and a ConvergenceException if the
101 process fails to converge.
102 
103 
104  class NonLinearLeastSquares
105 
106 This class is derived from FindMaximum2 and carries out a non-linear
107 least squares fit. It uses a QR decomposition to carry out the
108 operations required by FindMaximum2.
109 
110 A prototype class R1_Col_I_D is provided. The user needs to derive a
111 class from this which includes functions the predicted value of each
112 observation its derivatives. An object from this class has to be
113 provided to class NonLinearLeastSquares.
114 
115 Suppose we observe n normal random variables with the same unknown
116 variance and such the i-th one has expected value given by f(i,P)
117 where P is a column vector of unknown parameters and f is a known
118 function. We wish to estimate P.
119 
120 First derive a class from R1_Col_I_D and override Real operator()(int i)
121 to give the value of the function f in terms of i and the ColumnVector
122 para defined in class R1_CoL_I_D. Also override ReturnMatrix
123 Derivatives() to give the derivates of f at para and the value of i
124 used in the preceeding call to operator(). Return the result as a
125 RowVector. Construct an object from this class. Suppose in what follows
126 it is called pred.
127 
128 Now constuct a NonLinearLeastSquaresObject accessing pred and optionally
129 an iteration limit and an accuracy critierion.
130 
131  NonLinearLeastSquares NLLS(pred, 1000, 0.0001);
132 
133 The accuracy critierion should be somewhat less than one and 0.0001 is
134 about the smallest sensible value.
135 
136 Define a ColumnVector P containing a guess at the value of the unknown
137 parameter, and a ColumnVector Y containing the unknown data. Call
138 
139  NLLS.Fit(Y,P);
140 
141 If the process converges, P will contain the estimates of the unknown
142 parameters. If it does not converge an exception will be generated.
143 
144 The following member functions can be called after you have done a fit.
145 
146 Real ResidualVariance() const;
147 
148 The estimate of the variance of the observations.
149 
150 void GetResiduals(ColumnVector& Z) const;
151 
152 The residuals of the individual observations.
153 
154 void GetStandardErrors(ColumnVector&);
155 
156 The standard errors of the observations.
157 
158 void GetCorrelations(SymmetricMatrix&);
159 
160 The correlations of the observations.
161 
162 void GetHatDiagonal(DiagonalMatrix&) const;
163 
164 Forms a diagonal matrix of values between 0 and 1. If the i-th value is
165 larger than, say 0.2, then the i-th data value could have an undue
166 influence on your estimates.
167 
168 
169 */
170 
172 {
173  virtual void Value(const ColumnVector&, bool, Real&, bool&) = 0;
174  virtual bool NextPoint(ColumnVector&, Real&) = 0;
175  virtual Real LastDerivative(const ColumnVector&) = 0;
176 public:
177  void Fit(ColumnVector&, int);
178  virtual ~FindMaximum2() {} // to keep gnu happy
179 };
180 
182 {
183  // The prototype for a Real function of a ColumnVector and an
184  // integer.
185  // You need to derive your function from this one and put in your
186  // function for operator() and Derivatives() at least.
187  // You may also want to set up a constructor to enter in additional
188  // parameter values (that will not vary during the solve).
189 
190 protected:
191  ColumnVector para; // Current x value
192 
193 public:
194  virtual bool IsValid() { return true; }
195  // is the current x value OK
196  virtual Real operator()(int i) = 0; // i-th function value at current para
197  virtual void Set(const ColumnVector& X) { para = X; }
198  // set current para
199  bool IsValid(const ColumnVector& X)
200  { Set(X); return IsValid(); }
201  // set para, check OK
202  Real operator()(int i, const ColumnVector& X)
203  { Set(X); return operator()(i); }
204  // set para, return value
205  virtual ReturnMatrix Derivatives() = 0;
206  // return derivatives as RowVector
207  virtual ~R1_Col_I_D() {} // to keep gnu happy
208 };
209 
210 
212 {
213  // these replace the corresponding functions in FindMaximum2
214  void Value(const ColumnVector&, bool, Real&, bool&);
215  bool NextPoint(ColumnVector&, Real&);
216  Real LastDerivative(const ColumnVector&);
217 
218  Matrix X; // the things we need to do the
219  ColumnVector Y; // QR triangularisation
220  UpperTriangularMatrix U; // see the write-up in newmata.txt
222  Real errorvar, criterion;
223  int n_obs, n_param;
228  R1_Col_I_D& Pred; // Reference to predictor object
229  int Lim; // maximum number of iterations
230 
231 public:
232  NonLinearLeastSquares(R1_Col_I_D& pred, int lim=1000, Real crit=0.0001)
233  : criterion(crit), Pred(pred), Lim(lim) {}
234  void Fit(const ColumnVector&, ColumnVector&);
235  Real ResidualVariance() const { return errorvar; }
236  void GetResiduals(ColumnVector& Z) const { Z = Y; }
237  void GetStandardErrors(ColumnVector&);
238  void GetCorrelations(SymmetricMatrix&);
239  void GetHatDiagonal(DiagonalMatrix&) const;
240 
241 private:
242  void MakeCovariance();
243 };
244 
245 
246 // The next class is the prototype class for calculating the
247 // log-likelihood.
248 // I assume first derivatives are available and something like the
249 // Fisher Information or variance/covariance matrix of the first
250 // derivatives or minus the matrix of second derivatives is
251 // available. This matrix must be positive definite.
252 
253 class LL_D_FI
254 {
255 protected:
256  ColumnVector para; // current parameter values
257  bool wg; // true if FI matrix wanted
258 
259 public:
260  virtual void Set(const ColumnVector& X) { para = X; }
261  // set parameter values
262  virtual void WG(bool wgx) { wg = wgx; }
263  // set wg
264 
265  virtual bool IsValid() { return true; }
266  // return true is para is OK
267  bool IsValid(const ColumnVector& X, bool wgx=true)
268  { Set(X); WG(wgx); return IsValid(); }
269 
270  virtual Real LogLikelihood() = 0; // return the loglikelihhod
271  Real LogLikelihood(const ColumnVector& X, bool wgx=true)
272  { Set(X); WG(wgx); return LogLikelihood(); }
273 
274  virtual ReturnMatrix Derivatives() = 0;
275  // column vector of derivatives
276  virtual ReturnMatrix FI() = 0; // Fisher Information matrix
277  virtual ~LL_D_FI() {} // to keep gnu happy
278 };
279 
280 // This is the class for doing the maximum likelihood estimation
281 
282 class MLE_D_FI : public FindMaximum2
283 {
284  // these replace the corresponding functions in FindMaximum2
285  void Value(const ColumnVector&, bool, Real&, bool&);
286  bool NextPoint(ColumnVector&, Real&);
287  Real LastDerivative(const ColumnVector&);
288 
289  // the things we need for the analysis
290  LL_D_FI& LL; // reference to log-likelihood
291  int Lim; // maximum number of iterations
292  Real Criterion; // convergence criterion
293  ColumnVector Derivs; // for the derivatives
294  LowerTriangularMatrix LT; // Cholesky decomposition of FI
297 
298 public:
299  MLE_D_FI(LL_D_FI& ll, int lim=1000, Real criterion=0.0001)
300  : LL(ll), Lim(lim), Criterion(criterion) {}
301  void Fit(ColumnVector& Parameters);
302  void GetStandardErrors(ColumnVector&);
303  void GetCorrelations(SymmetricMatrix&);
304 
305 private:
306  void MakeCovariance();
307 };
308 
309 
310 #ifdef use_namespace
311 }
312 #endif
313 
314 
315 
316 #endif
317 
318 // body file: newmatnl.cpp
319 
320 
321 
322 
virtual ~LL_D_FI()
Definition: newmatnl.h:277
double Real
Definition: include.h:57
ColumnVector para
Definition: newmatnl.h:191
Real LogLikelihood(const ColumnVector &X, bool wgx=true)
Definition: newmatnl.h:271
LowerTriangularMatrix LT
Definition: newmatnl.h:294
DiagonalMatrix SE
Definition: newmatnl.h:296
LL_D_FI & LL
Definition: newmatnl.h:290
SymmetricMatrix Covariance
Definition: newmatnl.h:226
int Lim
Definition: newmatnl.h:291
virtual ~FindMaximum2()
Definition: newmatnl.h:178
virtual void Set(const ColumnVector &X)
Definition: newmatnl.h:260
void GetResiduals(ColumnVector &Z) const
Definition: newmatnl.h:236
SymmetricMatrix Covariance
Definition: newmatnl.h:295
DiagonalMatrix SE
Definition: newmatnl.h:227
virtual void Set(const ColumnVector &X)
Definition: newmatnl.h:197
MLE_D_FI(LL_D_FI &ll, int lim=1000, Real criterion=0.0001)
Definition: newmatnl.h:299
ColumnVector M
Definition: newmatnl.h:221
bool IsValid(const ColumnVector &X, bool wgx=true)
Definition: newmatnl.h:267
R1_Col_I_D & Pred
Definition: newmatnl.h:228
bool IsValid(const ColumnVector &X)
Definition: newmatnl.h:199
bool wg
Definition: newmatnl.h:257
virtual void WG(bool wgx)
Definition: newmatnl.h:262
Real operator()(int i, const ColumnVector &X)
Definition: newmatnl.h:202
Definition: newmat.h:543
UpperTriangularMatrix U
Definition: newmatnl.h:220
NonLinearLeastSquares(R1_Col_I_D &pred, int lim=1000, Real crit=0.0001)
Definition: newmatnl.h:232
ColumnVector Derivs
Definition: newmatnl.h:293
virtual bool IsValid()
Definition: newmatnl.h:265
virtual bool IsValid()
Definition: newmatnl.h:194
const ColumnVector * DataPointer
Definition: newmatnl.h:224
Real Criterion
Definition: newmatnl.h:292
Real ResidualVariance() const
Definition: newmatnl.h:235
virtual ~R1_Col_I_D()
Definition: newmatnl.h:207
ColumnVector para
Definition: newmatnl.h:256
ColumnVector Y
Definition: newmatnl.h:219