Random timeline data – depends on previous data
An iterator can serve as a non-random pseudo-factory, which makes it possible to generate time series data as follows:
>>> from datetime import datetime, timedelta
>>> import randog.factory
>>> def iter_datetime(start: datetime, step: timedelta):
...     nxt = start
...     while True:
...         yield nxt
...         nxt += step
>>> def iter_randomwalk(start: int = 0, step: int = 1):
...     step_f = randog.factory.randchoice(-step, step)
...     nxt = start
...     while True:
...         yield nxt
...         nxt += step_f.next()
>>> factory = randog.factory.from_example({
...     "smpl_datetime": iter_datetime(datetime(2022, 1, 1, 12), timedelta(hours=1)),
...     "location": iter_randomwalk(),
... })
>>> # hourly timeline from 2022-01-01T12:00:00
>>> timeline = list(factory.iter(200))
A timeline generated by this example might look like this:
[
    {"smpl_datetime": datetime(2022, 1, 1, 12, 0), "location": 0},
    {"smpl_datetime": datetime(2022, 1, 1, 13, 0), "location": 1},
    {"smpl_datetime": datetime(2022, 1, 1, 14, 0), "location": 0},
    {"smpl_datetime": datetime(2022, 1, 1, 15, 0), "location": -1},
    {"smpl_datetime": datetime(2022, 1, 1, 16, 0), "location": -2},
    ...
]
As can be seen from the definition of iter_datetime, the value of smpl_datetime is not random; it increases by exactly one hour per record. The value of location is random, but it always differs from the previous value by exactly 1 (up or down); that is, it is a random walk. In this way, an iterator can be used to create a factory that generates values dependent on the previous value.
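The dependency on the previous value can be checked without randog at all. The following is a minimal sketch using only the standard library: random.Random stands in for the randchoice factory, and the two iterators are paired by hand with zip instead of from_example. Both substitutions are assumptions made so that the sketch is self-contained; it is not how randog works internally.

```python
import random
from datetime import datetime, timedelta
from itertools import islice


def iter_datetime(start, step):
    # Deterministic: each value is the previous one plus a fixed step.
    nxt = start
    while True:
        yield nxt
        nxt += step


def iter_randomwalk(start=0, step=1, rng=None):
    # Random but stateful: each value is the previous one plus or minus `step`.
    # random.Random stands in for randog's randchoice factory (an assumption
    # made so that this sketch runs without randog).
    rng = rng or random.Random()
    nxt = start
    while True:
        yield nxt
        nxt += rng.choice((-step, step))


# Pair the two iterators by hand; in the example above, the factory does this.
pairs = zip(
    iter_datetime(datetime(2022, 1, 1, 12), timedelta(hours=1)),
    iter_randomwalk(rng=random.Random(0)),
)
rows = [{"smpl_datetime": d, "location": loc} for d, loc in islice(pairs, 200)]

# Every record depends on the one before it.
for prev, cur in zip(rows, rows[1:]):
    assert cur["smpl_datetime"] - prev["smpl_datetime"] == timedelta(hours=1)
    assert abs(cur["location"] - prev["location"]) == 1
```

The generators keep their state in the local variable nxt between calls, which is why each record can build on the previous one.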
Note
If you want to add an auto-incremental field, you can use increment. See also: Incremental integer factory.
Change the type of smpl_datetime to str
In the example above, the dict passed to from_example contained the iterators themselves as the examples of its elements. If you want to use factory methods on an element, create that element's factory explicitly with by_iterator. The following example uses post_process to convert smpl_datetime to a string.
>>> from datetime import datetime, timedelta
>>> import randog.factory
>>> def iter_datetime(start: datetime, step: timedelta):
...     nxt = start
...     while True:
...         yield nxt
...         nxt += step
>>> def iter_randomwalk(start: int = 0, step: int = 1):
...     step_f = randog.factory.randchoice(-step, step)
...     nxt = start
...     while True:
...         yield nxt
...         nxt += step_f.next()
>>> factory = randog.factory.from_example({
...     "smpl_datetime": randog.factory.by_iterator(
...         iter_datetime(datetime(2022, 1, 1, 12), timedelta(hours=1))
...     ).post_process(lambda d: d.isoformat()),
...     "location": iter_randomwalk(),
... })
>>> # hourly timeline from 2022-01-01T12:00:00
>>> timeline = list(factory.iter(200))
>>> timeline[0]
{'smpl_datetime': '2022-01-01T12:00:00', 'location': 0}
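As an alternative sketch, the conversion could also happen inside the iterator itself, by wrapping the datetime generator in a second generator that yields strings. The helper name iter_isoformat is hypothetical and not part of randog; the sketch uses only the standard library and does not show the wrapped iterator being passed to from_example.

```python
from datetime import datetime, timedelta
from itertools import islice


def iter_datetime(start, step):
    # Same deterministic generator as in the examples above.
    nxt = start
    while True:
        yield nxt
        nxt += step


def iter_isoformat(datetimes):
    # Hypothetical helper: convert each datetime to an ISO 8601 string lazily.
    for d in datetimes:
        yield d.isoformat()


stamps = list(
    islice(iter_isoformat(iter_datetime(datetime(2022, 1, 1, 12), timedelta(hours=1))), 3)
)
print(stamps)
# ['2022-01-01T12:00:00', '2022-01-01T13:00:00', '2022-01-01T14:00:00']
```

Compared with this wrapper, post_process keeps the conversion separate from the iterator, so the same iter_datetime can be reused wherever datetime values are needed.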