# Tuning the parameters for a particular system

The default parameters in kART are presented only to give a guide of
that simulation is working. For production the input parameters have to
be tuned according to the particular system in order to reduce the
number of force computations by event found. To understand how to reduce
them (see *evalf* below), we do a first test/run and analyze the output
files *sortiproc.n* (this output is not printed by default, so you have
to set the logical variable `PRINT_DETAILS` to true). Below, we split
the analysis in two steps.

*Leaving the basin*, is the first analysis to do before finding the
saddle point. To understand how to reduce the number of force
computations we analyse the variable *kter*, which counters the number
of iterations for leaving the basing. How often the data is printed is
controlled by the variable `KPRINT`. As before *delr* gives the sum of
the maximum displacement at each step. Here a fragment from one of the
files *sortiproc.n*:

     1) Number of displaced atoms initially: 15 
     2) kter: 0 Energy: -4119.350998 e-val: 0.5654E+00 delr: 0.109793 
     3) kter: 1 Energy: -4119.286709 e-val: 0.5081E+00 delr: 0.207569 
     4) kter: 2 Energy: -4119.123240 e-val: 0.5070E+00 delr: 0.303452 
     5) kter: 3 Energy: -4118.871215 e-val: 0.5062E+00 delr: 0.397573 
     6) kter: 4 Energy: -4118.540492 e-val: 0.5054E+00 delr: 0.490058 
     7) kter: 5 Energy: -4118.139916 e-val: 0.5063E+00 delr: 0.581018
     8) kter: 6 Energy: -4117.676847 e-val: 0.4374E+00 delr: 0.670523
     9) kter: 7 Energy: -4117.156694 e-val: -0.7176E-01 delr: 0.758666 
    10) kter: 8 Energy: -4116.584468 e-val: -0.4431E+00 delr: 0.845545 
    11) kter: 9 Energy: -4115.965492 e-val: -0.1025E+01 delr: 0.931182

Leaving the basin can be accelerated by setting `INCREMENT_SIZE` or
`MIN_NUMBER_KSTEPS`. Increasing the former gives larger steps while the
latter reduces the number of forces computed per *kter* step. That is,
we know that we are out of basin when the eigenvalue (e-val) becomes
negative, therefore, the computation of lanczos can be avoided during
the first steps while the eigenvalue is positive. For example from the
lines above we could set `MIN_NUMBER_KSTEPS` to 8, so during the first 6
steps (2 is rested always), lanczos routine will not be called avoiding
the computation of additional 16$\times6=96$ force calls (as per step we
compute 16 forces if `NUMBER_LANCZOS_VECTORS` is set to 15, plus 3 more
for perpendicular relaxation using the FIRE algorithm. The goal of this
relaxation is to avoid collisions).

Variables `CHECK_LANCZOS_STAB` and `NBRE_POINTS_LANCZOS` have been added
to check eigenvalues stability which is obtained by numerical derivative
in the Lanczos scheme. `CHECK_LANCZOS_STAB` will check the eigenvalues
stability for a defined value of `NUMBER_LANCZOS_VECTORS` and
`NBRE_POINTS_LANCZOS` and for a range of $1 \times 10^{-1}$ to
$1 \times 10^{-7}$ for the step size of the numerical derivative (which
override the `LANCZOS_STEP` variable during the test.
`NBRE_POINTS_LANCZOS` is the numerical method used to compute numerical
derivative. if it is set to 1 then kART will perform a Newton's
difference quotient $\frac{f(x+h)-f(x)}{h}$ . Also, if it is set to 2
then kART will perform a more accurate approximation which is the
symmetric difference quotient $\frac{f(x+h)-f(x)}{h}$ .

Warning, if `INCREMENT_SIZE` or `MIN_NUMBER_KSTEPS` are large the
displacement in the initial direction can be aggressive, ignoring small
saddle points and conducting some events to unphysical saddle points. If
`MIN_NUMBER_KSTEPS` is too large, for *kter* smaller than
`MIN_NUMBER_KSTEPS` the eigenvalue is not evaluated and the system
continues moving in a fixed initial direction that may converge to a
saddle point that is not directly connected to the actual local minimum.
We point out that the ideal is to follow the eigenstate as soon as the
eigenvalue reaches the negative threshold (to jump to *iter* loop, see
below) and not to follow the initial deformation.

The previous analysis is strongly depending of the potential type used
and the complexity of the system. For potentials like ReaxFF or complex
systems care must be taken. It is important realize the complexity of
the system under study and to set `INCREMENT_SIZE` and
`MIN_NUMBER_KSTEPS` according. Analyzing one event is not enough,
severals have to be seen to fix these parameters.

*Convergence to the saddle point*, this is the second analysis and it is
a critical issue because it is in this part that most of the
computational resources are used computing forces. To find a successful
event we use a research of the word 'TRUE\". For example, the linux
command `grep -c TRUE sortiproc.*` will give the number of succesful
events for each *sortiproc.n* file (to see the number of unsuccessful
events try 'FALSE\"). Here a fragment from one of these successful
events from one file *sortiproc.n*:

    iter: 2 ener: -4114.6438 fpar: -2.537541 fperp: 7.234678 e-val: -0.2564E+01 delr: 1.3753 npart: 18 evalf: 474
    iter: 4 ener: -4118.3530 fpar: -0.350070 fperp: 1.257846 e-val: -0.7674E+01 delr: 0.9207 npart: 6  evalf: 550
    iter: 8 ener: -4118.5300 fpar: -0.002114 fperp: 0.116685 e-val: -0.4234E+01 delr: 0.8696 npart: 5 evalf: 678
    iter: 10 ener: -4118.5334 fpar: -0.000549 fperp: 0.068056 e-val: -0.3966E+01 delr: 0.8697 npart: 5 evalf: 742
    iter: 12 ener: -4118.5347 fpar: -0.000140 fperp: 0.045462 e-val: -0.3903E+01 delr: 0.8704 npart: 5 evalf: 806 
    iter: 14 ener: -4118.5354 fpar: -0.000033 fperp: 0.032623 e-val:-0.3905E+01 delr: 0.8711 npart: 5 evalf: 870 
    iter: 16 ener: -4118.5357 fpar: -0.000008 fperp: 0.024442 e-val: -0.3923E+01 delr: 0.8718 npart: 5 evalf: 934 
    iter: 18 ener: -4118.5359 fpar: -0.000011 fperp: 0.018805 e-val: -0.3935E+01 delr: 0.8724 npart: 5 evalf: 998  
    iter: 20 ener: -4118.5360 fpar: -0.000003 fperp: 0.014709 e-val:-0.3931E+01 delr: 0.8729 npart: 5 evalf: 1062 
    iter: 22 ener: -4118.5361 fpar: -0.000002 fperp: 0.011623 e-val: -0.3929E+01 delr: 0.8733 npart: 5 evalf: 1126 
    iter: 24 ener: -4118.5361 fpar: -0.000001 fperp: 0.009943 e-val: -0.3932E+01 delr: 0.8736 npart: 5 evalf: 1186 
    current energy -4118.53611 
    Finding saddle, success TRUE: ret 20024
    Mincounter is : 19 and scounter is 19 
    Configuration stored in file sad19
    Initial-SADDLE displacement (npart): 0.87357227 ( 5) 
    dot product between displacement and projection: 9.34331E-01 
    Pushing with length: 8.735722842777609E-002
    We now minimise to new minimum 
    initial energy : TTL -4118.55312094169
    Total energy M: -4119.3487 P: 0.006125 fnorm: 0.167860 fmax: 0.075157 vnorm: 0.001339 dt: 0.059487 
    FIRE:Minimization successful 
    fnrm: 0.00029416 ,fmax: 0.00498449 ,iter: 78
    FINAL energy -4119.35149889 
    Initial-FINAL displacement (1.68820790 ( 5) Evalf: 3515 
    Atom that moved to the most to SADDLE: 15 
    Event recentered around atom 1 color: C active: T 1 T
    worker 1) returned for atom: 1 and barrier 0.8154

*iter*: counter of iterations after leaving the basing. How often the
data is printed is controlled by the variable `IPRINT` (in this case is
every 2 iterations).

*e-val*: eigenvalue. When converging to a saddle point the eigenvalue
should be stable, that is, no larger variations should be observed. This
parameter must to converge to the lowest eigenvalue of the hessian
matrix evaluated at the saddle point as well as its respective
eigenvalue.

*fperp* and *fpar*: perpendicular and parallel components of the force.
Realize that *fpar* converges quadratically to zero while *fperp* does
it linearly [@Cances2009]. Therefore the convergence is dominated by
*fpar*, i.e, the system converges when *fpar* is less than certain
threshold defined by the variable `FINE_EXIT_FORCE_THRESHOLD`.

*ener*: convergence to energy of the saddle point.

*evalf*: number of force evaluations.

*npart*: number of particles that move most after certain distance
threshold (`DISPLACEMENT_THRESHOLD`, by default is 0.1 Å).

From this example we can see that with a `FINE_EXIT_FORCE_THRESHOLD` of
0.117, the energy converges with 6 significant digits after *iter* $=8$,
while 24 iterations are required for a convergence with 8 digits if a
`FINE_EXIT_FORCE_THRESHOLD` is set to 0.01 (the value set in this
example, as the code stops when *fperp* $=0.009943<0.01$). Of course 24
steps are OK, but for some systems the number of steps can become very
large. Therefore it is important to see the convergence of the energy
and the force in order to avoid unnecessary force calculations.

On the other hand, it is important also to see how many steps in average
are needed to reach saddle points. If *iter* becomes to large, it means
that maybe the initial deformation is not good. In this case it should
be better to stop the convergence by setting a maximum number of
iteration and restart with a different deformation (this is less costly
than to do for example 1000 *iter* steps to find only one saddle point).
The maximum number of *iter* steps is fixed by the variable
`MAX_ITER_ACTIVATION`.

For each topology several attempts/trials are done for finding a saddle.
From line 12 we see that 1186 force calls have been done for TRUE saddle
finding (energy barrier of 0.8154 eV, see line 29), but from line 26 we
see that the real cost is 3515 force calls, as several previous attempts
are done with FALSE result (plus small number of forces computed for the
minimization to the new minimum). Thus the efficiency not only depends
on the number of forces computed by attempt, but also of the number of
fail attempts for certain topology. How to reduce these FALSE attempts
is still a challenge. The maximum number of attempts is set by the
variable `MAX_TRIALS_TO_CONVERGE` (by default is 10).