ltsdb/doc/processing.pipe

Each filter has a list of dependencies, which should probably be
specified in a rather high-level notation with wildcards and
substitutions. E.g. something like this:
measure=bytes_used -> measure=time_until_full
This would match all series with measure=bytes_used and use them to
compute a new series with measure=time_until_full and all other
dimensions unchanged.
That looks too simple, but do I need anything more complicated?
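If it ever does need to be more explicit, the wildcards and
substitutions could be spelled out, e.g. (purely hypothetical
syntax):

    host=*,disk=*,measure=bytes_used
        -> host=$host,disk=$disk,measure=time_until_full

That is just the implicit "all other dimensions unchanged" behaviour
written out.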
Obviously I also need to specify the filter. And I may not even need
to specify the result as the filter can determine that itself.
Although that may be bad for reusability.
Similar for auxiliary inputs. In the example above we also need the
corresponding measure=bytes_usable timeseries. The filter can
determine that itself, but maybe it's better to specify that in the
rule?
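If we do name them in the rule, it might look like this (again
hypothetical syntax, with the filter and the auxiliary input spelled
out):

    measure=bytes_used, aux measure=bytes_usable
        -> measure=time_until_full  via extrapolate_fill_time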
At run-time we expand the rules and just use the ids.
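A minimal sketch of that expansion, assuming a series is identified
by its dimension dict and the id is just a canonical string (all
names here are made up):

    def series_id(dims):
        # Canonical id: sorted dimension=value pairs.
        return ",".join(f"{k}={v}" for k, v in sorted(dims.items()))

    def expand(rule, all_series):
        # Yield (input_id, output_id) for every series matching the rule.
        for dims in all_series:
            if dims.get("measure") != rule["from"]:
                continue
            out = dict(dims, measure=rule["to"])  # other dimensions unchanged
            yield series_id(dims), series_id(out)

    rule = {"from": "bytes_used", "to": "time_until_full"}
    series = [{"host": "a", "disk": "sda", "measure": "bytes_used"}]
    for in_id, out_id in expand(rule, series):
        print(in_id, "->", out_id)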
I think we want to decouple the processing from the data acquisition, so
the web service should just write the changed timeseries into a queue.
Touch a file with the key as the name in a spool dir. The processor can
then check if there is anything in the spool dir and process it. The
result of the filter is then again added to the spool dir (make sure
there are no circular dependencies! Hmm, that's up to the user I guess?
Or each generated series could have a rate limit?)
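A sketch of that spool mechanism (the paths and names are
assumptions):

    import os

    SPOOL = "/var/spool/ltsdb"  # hypothetical location

    def enqueue(key):
        # Web service side: touch a file named after the series key.
        open(os.path.join(SPOOL, key), "a").close()

    def process_spool(run_filters):
        # Processor side: handle every spooled key, re-spool results.
        for key in os.listdir(SPOOL):
            os.unlink(os.path.join(SPOOL, key))  # claim before processing
            for out_key in run_filters(key):     # filters return generated keys
                enqueue(out_key)                 # outputs feed back into the queue

A per-series rate limit would then just be a check in enqueue()
before touching the file.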
In addition to filters (which create new data) we also need some kind of
alerting system. That could just be a filter which produces no data but
does something else instead, like sending an email. So I'm not sure
whether it makes sense to distinguish these.
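A sketch of that, treating an alert as a filter with a side effect
and no output series (interface, threshold and addresses are all
made up):

    import smtplib
    from email.message import EmailMessage

    def disk_full_alert(key, datapoints):
        # datapoints: (timestamp, seconds_until_full) pairs, newest last
        if datapoints and datapoints[-1][1] < 3600:
            msg = EmailMessage()
            msg["Subject"] = f"disk nearly full: {key}"
            msg["From"] = "ltsdb@example.com"
            msg["To"] = "ops@example.com"
            msg.set_content(f"{key} is predicted to fill within the hour")
            smtplib.SMTP("localhost").send_message(msg)
        return []  # no new series, so nothing gets re-spooled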
We should record all the used inputs (recursively) for each generated
series (do we actually want to store the transitive closure or just the
direct inputs? We can expand that when necessary.). Doing that for each
datapoint is overkill, but we should mark each input with a "last seen"
timestamp so that we can ignore or scrub inputs which are no longer
used.
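A sketch of that bookkeeping (the schema is an assumption):

    import time

    inputs = {}  # output_id -> {input_id: last_seen_timestamp}

    def record_input(output_id, input_id):
        inputs.setdefault(output_id, {})[input_id] = time.time()

    def direct_inputs(output_id, max_age=90 * 86400):
        # Direct inputs only; expand recursively if the closure is
        # needed. Inputs not seen within max_age can be ignored or
        # scrubbed.
        cutoff = time.time() - max_age
        return [i for i, seen in inputs.get(output_id, {}).items()
                if seen >= cutoff]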
Do we need a negative/timeout trigger? I.e. if a timeseries which is
used as an input is NOT updated in time, trigger the filter anyway so
that it can take appropriate action? If we have that, how do we filter
out obsolete measurements? We don't want to get alerted that we haven't
gotten any disk usage data from a long-discarded host for 5 years. For
now I think we rely on other, still-active checks to fail if a
measurement fails to run.
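If we add such a trigger later, the last-seen stamps above would
bound it; something like this (thresholds made up):

    import time

    def overdue_inputs(last_seen, timeout=2 * 3600, max_age=30 * 86400):
        # Overdue but not obsolete: missed the update window, yet seen
        # recently enough that the series is presumably still in use.
        now = time.time()
        return [i for i, seen in last_seen.items()
                if timeout < now - seen <= max_age]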