Details on using advanced sampling to find new states of the system as quickly as possible.

## MR121-GSGSW

We used the setup with 34 states, which originate from a system of 51 metastable states and therefore suffer from an imbalance in the number of simulations.

| Parameter | Value/Comment |
| --- | --- |
| `Target Property` | maximize the probability of finding a transition to an existing, but not yet explored, state |
| `Iterations` | 1000 maximum, although this number was rarely reached during the simulation |
| `Simulations` | 1000 for each simulation type |
| `SimulationType` | Adaptive, Continuous, Random, BestCase, WorstCase |

## Results

### Comparison BEST (green) / WORST (red) / ADAPTIVE (black)

### Comparison CONT (green) / RANDOM (red) / ADAPTIVE (black)

The plots show the average number of states found and the standard deviation for the BestCase, the WorstCase, and the Adaptive case, and correspondingly for the other set of simulations.
At first sight all plots look very similar, especially in the first part (roughly the first 150 iterations). After that, the Adaptive case tends to be slightly higher, both in the average and in the upper bound, which means it is more likely to find all states sooner than in the other two cases. Surprisingly, it even outperforms the BestCase, which should not happen.
Another negative point is that the Continuous and the Random run are still better than the Adaptive one. This is also wrong, which leads to the conclusion that either there is a problem in the simulation, that the "obvious" convergence in the number of simulations is misleading although the curves look very smooth, or that boundary effects cause the system to stop, for example when starting state 1 is chosen more than once. This problem, for instance, makes the trajectories shorter as the exploration approaches 34 states, so the number of long trajectories is small and the statistics in this region is poor. Far away from the full number of states the difference is marginal.
Also, the approximation by a Gaussian may be misleading and cause wrong results.

#### Conclusions

Since the number of surviving trajectories decreases with the number of iterations, the statistics become worse in this region. But this is not the only problem: since we use only the surviving trajectories as representatives in our statistics, the result is necessarily biased by this selection.
It may turn out that especially good runs have to start in states where we have only few trajectories, so the result is shifted to the bad side. This is an important issue and we will have to rerun the simulations for the 51-state case.

If we look only at the short-time behaviour, we can hardly see a difference. This is probably because the graph is strongly interconnected, in the sense that most pairs of nodes have a finite transition probability.
This makes it easy for the system to explore many states quickly, regardless of the target property used to choose initial states. The cause may be a lag time that is too long or, put differently, the difficulty of defining a good Markov model.
In a connected system every state can be reached in a finite number of steps, so if we choose the lag time too large, the transition matrix becomes dense and we lose the topology in the transition matrix: all entries will be larger than zero. The ideal case would be a generator that captures these properties exactly, but such a generator is hard to find. We will have to investigate this issue further.
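The densification effect can be illustrated with a small toy example (a 5-state nearest-neighbour chain with made-up probabilities, not the MR121-GSGSW matrix): at lag 1 the transition matrix is sparse and tridiagonal, but raising it to a higher power, which corresponds to a longer lag time, fills in every entry and the chain topology disappears.

```python
import numpy as np

# Hypothetical 5-state linear chain with nearest-neighbour hops only.
# At lag 1 the matrix is tridiagonal; at lag n it is completely dense.
n = 5
T = np.zeros((n, n))
for i in range(n):
    if i > 0:
        T[i, i - 1] = 0.25
    if i < n - 1:
        T[i, i + 1] = 0.25
    T[i, i] = 1.0 - T[i].sum()  # remaining probability stays in state i

nnz_lag1 = np.count_nonzero(T)                          # sparse: topology visible
nnz_lag5 = np.count_nonzero(np.linalg.matrix_power(T, n))  # dense: topology lost
print(nnz_lag1, nnz_lag5)
```

At lag 1 only 13 of the 25 entries are nonzero; at lag 5 all 25 are, so the chain structure can no longer be read off the matrix.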

## Virtual Linear Chain Model (34 States)

Apparently our last attempt was not successful in achieving a faster exploration of the phase space. We therefore designed a linear chain of 34 states, in which each state has at most two neighbours. We start the exploration at one end of the chain.
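Such a chain can be sketched as a tridiagonal transition matrix; the hop probability `p` below is an assumption for illustration, not a value from the reference system.

```python
import numpy as np

# Sketch of a 34-state linear chain: every state is connected only to its
# nearest neighbours, so each row has at most two off-diagonal entries.
# Exploration starts at one end of the chain (state 0).
N = 34
p = 0.3  # hypothetical probability of hopping to a given neighbour
T = np.zeros((N, N))
for i in range(N):
    if i > 0:
        T[i, i - 1] = p
    if i < N - 1:
        T[i, i + 1] = p
    T[i, i] = 1.0 - T[i].sum()  # rest of the probability stays in state i
```

Because every row is stochastic and has at most two off-diagonal entries, the topology stays visible in the matrix, which is exactly what the strongly interconnected 34-state system above lacked.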

| Parameter | Value/Comment |
| --- | --- |
| `Target Property` | maximize the probability of finding a transition to an existing, but not yet explored, state |
| `Iterations` | 1000 for each simulation |
| `Simulations` | 1000 for each simulation type |
| `SimulationType` | Adaptive, Continuous, Random, BestCase |

## Results

### Comparison BEST (blue) / CONTINUOUS (green) / RANDOM (red) / ADAPTIVE (black)

In this case the result is much better. One can clearly see that the Adaptive case approaches the maximal number of states much faster than the Continuous or Random case. The BestCase is of course faster still, but it uses information that is normally not available during a simulation; it only shows what is theoretically possible if one knows the system in advance.

#### Best / Worst Case

To have some reference runs, we ran a set of simulations in which we always picked the state that had (according to the reference transition matrix) the highest probability of finding a new state. For this we summed, in each row, the entries corresponding to states that have not been visited yet. The row with the highest probability of jumping to an unknown state was chosen as the BestCase; vice versa, the row with the lowest probability as the WorstCase.
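The selection rule can be sketched as follows; `T_ref` plays the role of the reference transition matrix (the 3-state matrix below is a made-up example) and `visited` is the set of already explored states.

```python
import numpy as np

def pick_start_state(T_ref, visited, best=True):
    """Pick the visited state with the highest (best) or lowest (worst)
    summed probability of jumping to a not-yet-visited state."""
    n = T_ref.shape[0]
    unvisited = [s for s in range(n) if s not in visited]
    # per visited row: sum the entries that lead to unknown states
    scores = {s: T_ref[s, unvisited].sum() for s in visited}
    key = max if best else min
    return key(scores, key=scores.get)

# Made-up 3-state example: states 0 and 1 are explored, state 2 is not.
T_ref = np.array([[0.8, 0.2, 0.0],
                  [0.1, 0.6, 0.3],
                  [0.0, 0.3, 0.7]])
best_start = pick_start_state(T_ref, visited={0, 1}, best=True)
worst_start = pick_start_state(T_ref, visited={0, 1}, best=False)
print(best_start, worst_start)  # state 1 can reach state 2, state 0 cannot
```

With `best=True` the rule picks state 1 (jump probability 0.3 into the unknown state), with `best=False` state 0 (probability 0.0).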

#### Comments