multi_locus_analysis.finite_window.simulation

multi_locus_analysis.finite_window.simulation.ab_window(rands, window_size, offset, num_replicates=1, states=[0, 1], seed=None, random_state=None)

Simulate an asynchronous two-state system from time 0 to window_size.

Similar to multi_locus_analysis.finite_window.ab_window_fast(), but designed to work when the means of the distributions being used are hard to calculate.

Simulate asynchronicity by starting the simulation in a uniformly random state at a time \(-t_\infty\) (a large negative number).

Note

This time must be supplied through the offset parameter. If its magnitude is not much larger than the mean waiting times being used, the asynchronicity approximation will be very poor.

The simulation only records between times 0 and window_size.

Parameters
  • rands ((2,) List[scipy.stats.rv_continuous]) – A pair of callables, one per state, each taking random_state and size keyword arguments. random_state accepts a np.random.RandomState seed, and size accepts a tuple specifying the shape of the requested array.

  • window_size (float) – The width of the window over which the observation takes place

  • offset (float) – The (negative) time at which to start (in state 0) in order to equilibrate the simulation state by time t=0.

  • states ((2,) array_like) – The “names” of each state, defaults to [0, 1].

  • num_replicates (int) – Number of times to run the simulation, default 1.

  • seed (Optional[int]) – Random seed to start the simulation with

  • random_state (np.random.RandomState) – Random state to start the simulation with. Preempts the seed argument.

Returns

df – The start/end times of each waiting time simulated. This data frame has columns=[‘replicate’, ‘state’, ‘start_time’, ‘end_time’, ‘window_start’, ‘window_end’].

Return type

pd.DataFrame
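The pre-equilibration idea described for ab_window() can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: simulate_window, draw_wait, and the list-of-dicts output are hypothetical names and shapes chosen for the example, and exponential waiting times are assumed only for convenience.

```python
import random

def simulate_window(draw_wait, window_size, offset, rng):
    """Alternate between two states starting at `offset` (< 0) and keep
    only the waits that overlap the window [0, window_size].

    Hypothetical helper for illustration; not part of multi_locus_analysis.
    """
    t = offset
    state = rng.randrange(2)  # uniformly random initial state
    rows = []
    while t < window_size:
        end = t + draw_wait[state](rng)
        if end > 0:  # this wait reaches into the observation window
            rows.append({"state": state, "start_time": t, "end_time": end})
        t = end
        state = 1 - state  # switch to the other state
    return rows

rng = random.Random(0)
draw_wait = [
    lambda r: r.expovariate(1 / 2.0),  # state 0: mean wait 2.0 (assumed)
    lambda r: r.expovariate(1 / 5.0),  # state 1: mean wait 5.0 (assumed)
]
# The offset magnitude (1000) is much larger than either mean wait.
rows = simulate_window(draw_wait, window_size=20.0, offset=-1000.0, rng=rng)
```

The first recorded wait straddles time 0 and the last one straddles window_size, which is why the true start and end times are kept alongside the window bounds.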

multi_locus_analysis.finite_window.simulation.ab_window_fast(rands, means, window_size, num_replicates=1, states=[0, 1], seed=None, random_state=None)

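WARNING: BUGGY! The first waiting time (the one intersecting the left boundary of the window) needs to be drawn using the length-biased density t*f(t), where f is the waiting-time density. The current implementation does not do this.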

Simulate a two-state system switching between states A and B.

In addition to functions that can generate random waiting times for each state, this “fast” version of the code requires that the average waiting times be supplied as means[0] and means[1], respectively.

Warning

Passing a seed does not currently make the output reproducible. Apparently, either bruno_util.random.strong_default_seed() or this function is broken.

Parameters
  • rands ((2,) List[scipy.stats.rv_continuous]) – One of the random variables defined in scipy.stats. Alternatively, any callable that takes random_state and size kwargs. random_state should accept a np.random.RandomState seed. size will be a tuple specifying output shape of random number array requested.

  • means ((2,) array_like) – average waiting times for each of the states

  • window_size (float) – the width of the window over which the observation takes place

  • num_replicates (int) – number of times to run the simulation, defaults to 1

  • states ((2,) array_like) – the “names” of each state, defaults to [0, 1]

  • seed (np.random.RandomState) – state to start the simulation with

Returns

df – The start/end times of each waiting time simulated. This data frame has columns=[‘replicate’, ‘state’, ‘start_time’, ‘end_time’, ‘window_start’, ‘window_end’].

Return type

pd.DataFrame

Notes

Consider the waiting time intersecting the left boundary of the observation window. The left boundary will be a uniform fraction of the way through this waiting time. This can easily be seen in the case of finite-variance waiting times by using the central limit theorem (CLT) and starting the switching process arbitrarily far left of the window of observation, or it can in general be imposed by requiring time-homogeneity of the experiment.

We use this fact here to speed up correct simulation of time-homogeneous windows by directly simulating only the waiting times within the windows instead of also simulating a long run of “pre-equilibrating” waiting times some offset before the window, as in ab_window().
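The uniform-fraction property in these notes can be illustrated for one special case. The sketch below assumes exponential waiting times with mean m, for which the length-biased density t*f(t)/m is a Gamma distribution with shape 2 and scale m. The helper first_interval is hypothetical and is not the library's code; it only shows how the wait straddling the left boundary could be drawn directly.

```python
import random

def first_interval(mean, window_start, rng):
    """Draw the wait that straddles `window_start` for exponential waits.

    Hypothetical helper for illustration; assumes exponential waiting times.
    """
    # Length-biased exponential wait: Gamma(shape=2, scale=mean).
    total = rng.gammavariate(2.0, mean)
    # The window's left edge lies a uniform fraction of the way through it.
    frac = rng.random()
    start = window_start - frac * total
    return start, start + total

rng = random.Random(1)
samples = [first_interval(3.0, 0.0, rng) for _ in range(20000)]

# The straddling wait is on average twice as long as a typical wait,
# while the time remaining after the boundary has the ordinary mean.
mean_len = sum(e - s for s, e in samples) / len(samples)
mean_remaining = sum(e for s, e in samples) / len(samples)
```

For other waiting-time distributions the length-biased draw generally has no such closed form, which is consistent with ab_window() being offered for cases where the means are hard to calculate.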