have a list of dependencies
should probably be specified in a rather high level notation with
wildcards and substitutions and stuff.

E.g. something like this:
measure=bytes_used -> measure=time_until_full
This would match all series with measure=bytes_used and use them to
compute a new series with measure=time_until_full and all other
dimensions unchanged.

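A minimal sketch of how such a rule might be applied to one series. The
comma-separated key=value syntax and the names parse_dims, parse_rule and
apply_rule are made up for illustration, and wildcards are left out.

```python
def parse_dims(text):
    """Parse 'k1=v1,k2=v2' into a dict (this syntax is an assumption)."""
    return dict(part.strip().split("=", 1) for part in text.split(",") if part.strip())


def parse_rule(rule):
    """Split 'pattern -> replacement' into two dimension dicts."""
    pattern, replacement = rule.split("->")
    return parse_dims(pattern), parse_dims(replacement)


def apply_rule(rule, series_dims):
    """Return the dimensions of the derived series, or None if there is no match."""
    pattern, replacement = parse_rule(rule)
    if all(series_dims.get(k) == v for k, v in pattern.items()):
        # Dimensions named in the replacement are overwritten,
        # all other dimensions are carried over unchanged.
        return {**series_dims, **replacement}
    return None


rule = "measure=bytes_used -> measure=time_until_full"
series = [
    {"hostname": "a.example", "mountpoint": "/", "measure": "bytes_used"},
    {"hostname": "a.example", "mountpoint": "/", "measure": "bytes_usable"},
]
# Only the bytes_used series matches, so exactly one series is derived.
derived = [d for d in (apply_rule(rule, s) for s in series) if d is not None]
```

Here derived contains one entry with measure=time_until_full and the
hostname and mountpoint of the matching input.
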
That looks too simple, but do I need anything more complicated?
Obviously I also need to specify the filter. And I may not even need
to specify the result, as the filter can determine that itself.
Although that may be bad for reusability.
Similarly for auxiliary inputs. In the example above we also need the
corresponding measure=bytes_usable timeseries. The filter can
determine that itself, but maybe it's better to specify that in the
rule?

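One way the "specify it in the rule" variant could look, as a sketch only.
The rule layout (input/aux/output/filter keys) and the linear extrapolation
in time_until_full are assumptions, not something decided above.

```python
def time_until_full(used, usable):
    """Estimate the seconds until bytes_used reaches bytes_usable.

    used:   list of (timestamp, bytes_used) tuples, oldest first
    usable: latest bytes_usable value
    Uses a straight line through the first and last point (an assumption).
    Returns None if usage is not growing.
    """
    (t0, v0), (t1, v1) = used[0], used[-1]
    if t1 <= t0 or v1 <= v0:
        return None
    rate = (v1 - v0) / (t1 - t0)  # bytes per second
    return (usable - v1) / rate


# Hypothetical rule structure: the filter, its result and its auxiliary
# inputs are all named in the rule, so the filter stays reusable.
RULE = {
    "input": {"measure": "bytes_used"},
    "aux": [{"measure": "bytes_usable"}],
    "output": {"measure": "time_until_full"},
    "filter": time_until_full,
}


def aux_dims(rule, series_dims):
    """Dimensions of the auxiliary series belonging to one matched input."""
    return [{**series_dims, **aux} for aux in rule["aux"]]


estimate = RULE["filter"]([(0, 100), (100, 200)], usable=1000)
needed = aux_dims(RULE, {"hostname": "a.example", "measure": "bytes_used"})
```

With usage growing by 1 byte per second and 800 bytes left, the estimate is
800 seconds.
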
At run-time we expand the rules and just use the ids.

I think we want to decouple the processing from the data acquisition, so
the web service should just write the changed timeseries into a queue.
Touch a file with the key as the name in a spool dir. The processor can
then check if there is anything in the spool dir and process it. The
result of the filter is then again added to the spool dir (make sure
there are no circular dependencies! Hmm, that's up to the user I guess?
Or each generated series could have a rate limit?).

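The spool dir idea could be sketched like this. The function names enqueue
and drain are made up, and percent-encoding the key is an assumption to
keep characters like "/" out of the file name.

```python
import os
import tempfile
from urllib.parse import quote, unquote


def enqueue(spool_dir, key):
    """Mark a timeseries as changed by touching a file named after its key."""
    path = os.path.join(spool_dir, quote(key, safe=""))
    with open(path, "a"):
        os.utime(path, None)  # update the mtime if the file already exists


def drain(spool_dir):
    """Return all queued keys and remove their marker files."""
    keys = []
    for name in sorted(os.listdir(spool_dir)):
        os.remove(os.path.join(spool_dir, name))
        keys.append(unquote(name))
    return keys


spool = tempfile.mkdtemp()
key = "hostname=a.example,mountpoint=/,measure=bytes_used"
enqueue(spool, key)
enqueue(spool, key)  # a second change of the same series is still one file
queued = drain(spool)
```

A nice side effect is that repeated updates of the same series collapse
into one queue entry. Removing the marker before processing means a crash
in the processor loses the entry, so a real version may want to remove it
only afterwards.
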
In addition to filters (which create new data) we also need some kind of
alerting system. That could just be a filter which produces no data but
does something else instead, like sending an email. So I'm not sure
whether it makes sense to distinguish these.

We should record all the used inputs (recursively) for each generated
series (do we actually want to store the transitive closure or just the
direct inputs? We can expand that when necessary). Doing that for each
datapoint is overkill, but we should mark each input with a "last seen"
timestamp so that we can ignore or scrub inputs which are no longer
used.

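A sketch of the "store direct inputs, expand when necessary" variant with
last-seen timestamps. The in-memory dict and the names record_inputs,
all_inputs and scrub are assumptions for illustration.

```python
import time

# direct_inputs[output_id][input_id] = time the input was last used (epoch seconds)
direct_inputs = {}


def record_inputs(output_id, input_ids, now=None):
    """Remember the direct inputs of a generated series with a "last seen" time."""
    now = time.time() if now is None else now
    seen = direct_inputs.setdefault(output_id, {})
    for input_id in input_ids:
        seen[input_id] = now


def all_inputs(output_id):
    """Expand the direct inputs into the transitive closure on demand."""
    result, todo = set(), [output_id]
    while todo:
        for input_id in direct_inputs.get(todo.pop(), {}):
            if input_id not in result:  # also guards against cycles
                result.add(input_id)
                todo.append(input_id)
    return result


def scrub(max_age, now=None):
    """Forget inputs which have not been used for more than max_age seconds."""
    now = time.time() if now is None else now
    for seen in direct_inputs.values():
        for input_id in [i for i, ts in seen.items() if now - ts > max_age]:
            del seen[input_id]


record_inputs("time_until_full", ["bytes_used", "bytes_usable"], now=1000)
record_inputs("alert", ["time_until_full"], now=1000)
closure = all_inputs("alert")

# Later only bytes_used is still in use; everything older is scrubbed.
record_inputs("time_until_full", ["bytes_used"], now=2000)
scrub(max_age=500, now=2000)
```

The timestamp is per input and per generated series, not per datapoint, so
the bookkeeping stays small.
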
Do we need a negative/timeout trigger? I.e. if a timeseries which is
used as an input is NOT updated in time, trigger the filter anyway so
that it can take appropriate action? If we have that, how do we filter
out obsolete measurements? We don't want to get alerted that we haven't
gotten any disk usage data from a long-discarded host for 5 years. For
now I think we rely on other still active checks to fail if a
measurement fails to run.
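
If we ever do want the timeout trigger, one possible way to keep obsolete
measurements out is a second, much longer cutoff after which a series
counts as abandoned. This is only a sketch; the function name overdue and
both thresholds are assumptions.

```python
def overdue(last_update, now, timeout, give_up_after):
    """Return ids of inputs that are late but not yet considered abandoned."""
    return sorted(
        series_id
        for series_id, ts in last_update.items()
        if timeout < now - ts <= give_up_after
    )


DAY = 86400
last_update = {
    "host-a/bytes_used": 10 * DAY - 600,       # 10 minutes old: fine
    "host-b/bytes_used": 10 * DAY - 2 * 3600,  # 2 hours old: overdue
    "host-c/bytes_used": 0,                    # 10 days old: treated as abandoned
}
late = overdue(last_update, now=10 * DAY, timeout=3600, give_up_after=7 * DAY)
```

Only host-b would trigger the filter here; host-c is silently ignored.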