Random timeline data – depends on previous data

Iterators can serve as pseudo-factories that need not be random. They can be used to generate time series data as follows:

>>> from datetime import datetime, timedelta
>>> import randog.factory

>>> def iter_datetime(start: datetime, step: timedelta):
...     nxt = start
...     while True:
...         yield nxt
...         nxt += step

>>> def iter_randomwalk(start: int = 0, step: int = 1):
...     step_f = randog.factory.randchoice(-step, step)
...     nxt = start
...     while True:
...         yield nxt
...         nxt += step_f.next()

>>> factory = randog.factory.from_example({
...     "smpl_datetime": iter_datetime(datetime(2022, 1, 1, 12), timedelta(hours=1)),
...     "location": iter_randomwalk(),
... })

>>> # hourly timeline from 2022-01-01T12:00:00
>>> timeline = list(factory.iter(200))

A timeline generated by this example might look like this:

[
    {"smpl_datetime": datetime(2022, 1, 1, 12, 0), "location": 0},
    {"smpl_datetime": datetime(2022, 1, 1, 13, 0), "location": 1},
    {"smpl_datetime": datetime(2022, 1, 1, 14, 0), "location": 0},
    {"smpl_datetime": datetime(2022, 1, 1, 15, 0), "location": -1},
    {"smpl_datetime": datetime(2022, 1, 1, 16, 0), "location": -2},
    ...
]

As the definition of iter_datetime shows, the value of smpl_datetime is not random; it increases by exactly one hour per record. The value of location is random, but it always differs from the previous value by exactly 1 (either +1 or -1), so it is a random walk. In this way, an iterator can be used to create a factory that generates values dependent on the previous value.
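The dependence on the previous value can be checked without randog. The following is a minimal sketch, not part of the library: it assumes the standard library's random.Random as a stand-in for randog.factory.randchoice, and the seed parameter is a hypothetical addition for reproducibility.

```python
import itertools
import random
from typing import Iterator


def iter_randomwalk(start: int = 0, step: int = 1, seed: int = 0) -> Iterator[int]:
    # Assumption: random.Random replaces randog.factory.randchoice(-step, step).
    rnd = random.Random(seed)
    nxt = start
    while True:
        yield nxt
        # The next value is derived from the previous one.
        nxt += rnd.choice((-step, step))


# Take the first 200 values of the infinite walk.
walk = list(itertools.islice(iter_randomwalk(), 200))

# Differences between consecutive values.
diffs = [b - a for a, b in zip(walk, walk[1:])]

# The walk starts at `start`, and every gap is exactly +1 or -1.
print(walk[0], all(abs(d) == 1 for d in diffs))
```

Each value is random, yet no value can be more than one step away from its predecessor, which is the property the location field relies on.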

Note

If you want to add an auto-incremental field, you can use increment. See also: Incremental integer factory.

Change the type of smpl_datetime to str

In the example above, the dict passed to from_example contains the element examples (the iterators) as they are. If you want to use factory methods on an element, use by_iterator to create that element's factory explicitly. The following example uses post_process to convert smpl_datetime to a string.

>>> from datetime import datetime, timedelta
>>> import randog.factory

>>> def iter_datetime(start: datetime, step: timedelta):
...     nxt = start
...     while True:
...         yield nxt
...         nxt += step

>>> def iter_randomwalk(start: int = 0, step: int = 1):
...     step_f = randog.factory.randchoice(-step, step)
...     nxt = start
...     while True:
...         yield nxt
...         nxt += step_f.next()

>>> factory = randog.factory.from_example({
...     "smpl_datetime": randog.factory.by_iterator(
...         iter_datetime(datetime(2022, 1, 1, 12), timedelta(hours=1))
...     ).post_process(lambda d: d.isoformat()),
...     "location": iter_randomwalk(),
... })

>>> # hourly timeline from 2022-01-01T12:00:00
>>> timeline = list(factory.iter(200))
>>> timeline[0]
{'smpl_datetime': '2022-01-01T12:00:00', 'location': 0}
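For intuition, the per-value conversion in this example is comparable to lazily mapping a function over the iterator. The sketch below uses only the standard library and is an analogy, not a description of how randog implements post_process.

```python
import itertools
from datetime import datetime, timedelta
from typing import Iterator


def iter_datetime(start: datetime, step: timedelta) -> Iterator[datetime]:
    nxt = start
    while True:
        yield nxt
        nxt += step


# map() is lazy, so it is safe to apply to an infinite iterator.
stamps = map(
    lambda d: d.isoformat(),
    iter_datetime(datetime(2022, 1, 1, 12), timedelta(hours=1)),
)

# Take only the first three converted values.
first_three = list(itertools.islice(stamps, 3))
print(first_three)
# ['2022-01-01T12:00:00', '2022-01-01T13:00:00', '2022-01-01T14:00:00']
```

The conversion runs once per generated value, so the underlying datetime sequence still advances by exactly one hour between records.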