The dependencies should probably be specified in a rather high-level notation with wildcards and substitutions and such. E.g. something like this:

    measure=bytes_used -> measure=time_until_full

This would match all series with measure=bytes_used and use them to compute a new series with measure=time_until_full, with all other dimensions unchanged. That looks too simple, but do I need anything more complicated? Obviously I also need to specify the filter. And I may not even need to specify the result, since the filter can determine that itself, although that may be bad for reusability. Similar for auxiliary inputs: in the example above we also need the corresponding measure=bytes_usable timeseries. The filter can determine that itself, but maybe it's better to specify it in the rule? At run-time we expand the rules and just use the ids (sketched below).

I think we want to decouple the processing from the data acquisition, so the web service should just write the changed timeseries into a queue: touch a file with the key as the name in a spool dir. The processor can then check whether there is anything in the spool dir and process it (rough sketch below). The result of the filter is then again added to the spool dir (make sure there are no circular dependencies! Hmm, that's up to the user I guess? Or each generated series could have a rate limit?).

In addition to filters (which create new data) we also need some kind of alerting system. That could just be a filter which produces no data but does something else instead, like sending an email. So I'm not sure whether it makes sense to distinguish the two.

We should record all the used inputs (recursively) for each generated series. (Do we actually want to store the transitive closure or just the direct inputs? We can expand that when necessary.) Doing that for each datapoint is overkill, but we should mark each input with a "last seen" timestamp so that we can ignore or scrub inputs which are no longer used.

Do we need a negative/timeout trigger? I.e. if a timeseries which is used as an input is NOT updated in time, trigger the filter anyway so that it can take appropriate action? If we have that, how do we filter out obsolete measurements? We don't want to get alerted that we haven't gotten any disk usage data from a long-discarded host for five years. For now I think we rely on other, still-active checks to fail if a measurement fails to run.
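To make the rule expansion concrete, here is a minimal sketch, assuming a series id is just a dict of dimension=value pairs; the Rule fields and names are invented for illustration, not a fixed design:

    from dataclasses import dataclass

    @dataclass
    class Rule:
        match: dict    # dimensions a series must have, e.g. {"measure": "bytes_used"}
        output: dict   # dimensions substituted in the result
        aux: list      # extra measures the filter needs, e.g. ["bytes_usable"]

    def expand(rule, all_series):
        # Yield (input ids, output id) for every series the rule matches;
        # all dimensions other than the substituted ones stay unchanged.
        for series in all_series:
            if all(series.get(k) == v for k, v in rule.match.items()):
                aux_ids = [{**series, "measure": m} for m in rule.aux]
                yield [series] + aux_ids, {**series, **rule.output}

    rule = Rule(match={"measure": "bytes_used"},
                output={"measure": "time_until_full"},
                aux=["bytes_usable"])
    series = [{"host": "db1", "disk": "sda", "measure": "bytes_used"}]
    for inputs, out in expand(rule, series):
        print(inputs, "->", out)

Listing the auxiliary measures in the rule (rather than letting the filter discover them) keeps the expansion static, which would also make the dependency bookkeeping further down easier.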
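For the spool-dir handoff, something like this could work, assuming series keys are filesystem-safe and the filter API (matches/output_key/run) is made up here:

    import os, time

    SPOOL = "/var/spool/tsproc"   # placeholder path
    _last_run = {}                # output key -> timestamp of last run

    def notify_changed(series_key):
        # The web service calls this after writing a datapoint: touch a
        # file named after the series key, fully decoupled from processing.
        open(os.path.join(SPOOL, series_key), "a").close()

    def process_spool(filters, min_interval=60):
        # Drain the spool dir once.
        for key in os.listdir(SPOOL):
            os.unlink(os.path.join(SPOOL, key))
            for filt in filters:
                if not filt.matches(key):
                    continue
                out = filt.output_key(key)   # None for side-effect-only filters
                if out is not None:
                    now = time.time()
                    if now - _last_run.get(out, 0) < min_interval:
                        continue             # per-output rate limit damps cycles
                    _last_run[out] = now
                filt.run(key)                # compute and store the result...
                if out is not None:
                    notify_changed(out)      # ...which is itself spooled again

Deleting the marker file before running the filter means a touch that arrives mid-run just leaves a fresh file for the next pass, so nothing gets lost.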
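And alerting as a filter with a side effect, using the same assumed interface; it produces no output series, so nothing gets re-spooled (read_latest is a made-up storage accessor):

    import smtplib
    from email.message import EmailMessage

    class AlertFilter:
        # Same interface as a data-producing filter, but run() sends
        # mail instead of writing a new series.
        def __init__(self, threshold_secs, rcpt):
            self.threshold_secs, self.rcpt = threshold_secs, rcpt

        def matches(self, key):
            return "measure=time_until_full" in key   # assumed key format

        def output_key(self, key):
            return None                               # no generated series

        def run(self, key):
            value = read_latest(key)                  # assumed storage accessor
            if value < self.threshold_secs:
                msg = EmailMessage()
                msg["From"] = "tsproc@localhost"
                msg["To"] = self.rcpt
                msg["Subject"] = f"disk filling up soon: {key}"
                msg.set_content(f"time_until_full = {value}s")
                smtplib.SMTP("localhost").send_message(msg)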
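For the bookkeeping, storing only the direct inputs with a per-input "last seen" timestamp seems enough; the transitive closure can be expanded on demand. A dict-based sketch (real storage layer omitted):

    import time

    # output key -> {input key -> last time this input was actually used}
    direct_inputs = {}

    def record_inputs(output_key, input_keys):
        # Called once per filter run, not per datapoint.
        seen = direct_inputs.setdefault(output_key, {})
        now = time.time()
        for k in input_keys:
            seen[k] = now

    def all_inputs(output_key, acc=None):
        # Expand the transitive closure on demand from the direct inputs.
        acc = set() if acc is None else acc
        for k in direct_inputs.get(output_key, {}):
            if k not in acc:
                acc.add(k)
                all_inputs(k, acc)
        return acc

    def stale_inputs(max_age):
        # Inputs not used for a long time are candidates for scrubbing,
        # so long-discarded hosts stop mattering.
        cutoff = time.time() - max_age
        return [(out, k) for out, seen in direct_inputs.items()
                for k, t in seen.items() if t < cutoff]

A timeout trigger, if we want one, could scan the same table: inputs seen recently but not updated within their deadline get their dependent filters re-spooled, while anything past max_age is simply scrubbed rather than alerted on.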