Option to dilute unwritten metrics rather than drop the oldest #16568
Labels: feature request, waiting for response
Use Case
Allow users to keep metrics at a reduced frequency, rather than lose the oldest metrics entirely, during output downtime.
This would be something you would configure in `telegraf.conf`, alongside `metric_buffer_limit` and similar settings.
One use case is a system that monitors the health of a number of other systems. Here it seems reasonable that 8 days of system health statuses every 80 seconds may provide more valuable insight than 1 day of statuses every 10 seconds.
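The two figures in this use case occupy the same amount of buffer space, which a quick calculation can confirm. This is only a sketch: `coverage_days` is a hypothetical helper, and it assumes exactly one buffered metric per collection interval, which is a simplification.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def coverage_days(buffer_points, interval_seconds):
    """Return how many days a buffer covers at a given collection interval.

    Assumes one buffered metric per interval (an illustrative simplification).
    """
    return buffer_points * interval_seconds / SECONDS_PER_DAY


# Number of points needed to hold 1 day of data at a 10 second interval.
points = SECONDS_PER_DAY // 10

print(points)                     # 8640
print(coverage_days(points, 10))  # 1.0
print(coverage_days(points, 80))  # 8.0
```

The same 8640 points cover 1 day at 10 seconds or 8 days at 80 seconds, so the trade-off is purely between resolution and time span.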
Expected behavior
Say there is output downtime (e.g. the InfluxDB database can't be reached). If the `metric_buffer_limit` is reached, the first of every 2 consecutive datapoints is dropped.
Someone analysing graphs based on this data would see a lower frequency of data points throughout the downtime period. However, depending on when the downtime ended, the most recent data collected during the downtime may be available at twice the frequency of the rest.
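One possible reading of this behaviour can be sketched as follows. This is not Telegraf code: `Metric`, `DilutingBuffer`, and `simulate` are hypothetical names, and the sketch assumes the whole buffer is thinned in a single pass each time the limit is hit, with new metrics still arriving at the original interval.

```python
from collections import namedtuple

# Hypothetical stand-in for a buffered metric.
Metric = namedtuple("Metric", ["timestamp", "value"])


class DilutingBuffer:
    """Hypothetical buffer that thins its contents instead of dropping the oldest."""

    def __init__(self, limit):
        if limit < 2:
            raise ValueError("limit must be at least 2")
        self.limit = limit
        self.metrics = []

    def add(self, metric):
        if len(self.metrics) >= self.limit:
            self._dilute()
        self.metrics.append(metric)

    def _dilute(self):
        # Drop the first of every 2 consecutive datapoints (keep indices 1, 3, 5, ...).
        kept = self.metrics[1::2]
        # With an odd count, the final datapoint has no partner, so keep it.
        if len(self.metrics) % 2 == 1:
            kept.append(self.metrics[-1])
        self.metrics = kept


def simulate(limit, count, interval=10):
    """Feed `count` metrics, one per `interval` seconds, and return the kept timestamps."""
    buf = DilutingBuffer(limit)
    for i in range(count):
        buf.add(Metric(timestamp=i * interval, value=i))
    return [m.timestamp for m in buf.metrics]


print(simulate(limit=8, count=12))
# [10, 30, 50, 70, 80, 90, 100, 110]
```

With a limit of 8 and 12 collected metrics, the older half of the buffer ends up spaced 20 seconds apart while the newer half keeps the original 10 second spacing. A drop-oldest buffer of the same size would instead start at timestamp 40.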
Actual behavior
Currently the behaviour is just to drop the oldest metrics (I believe).
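For comparison, drop-oldest behaviour can be modelled with a bounded queue. This is only an illustration of the general strategy, not Telegraf's actual buffer implementation, and `simulate_drop_oldest` is a hypothetical name.

```python
from collections import deque


def simulate_drop_oldest(limit, count, interval=10):
    """Feed `count` timestamps into a bounded queue that discards the oldest when full."""
    buf = deque(maxlen=limit)  # a full deque drops items from the opposite end on append
    for i in range(count):
        buf.append(i * interval)
    return list(buf)


print(simulate_drop_oldest(limit=8, count=12))
# [40, 50, 60, 70, 80, 90, 100, 110]
```

Everything older than the last 8 datapoints is lost, so the retained window stays at full resolution but never grows beyond `limit * interval` seconds.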
Additional info
Potential issues in implementation may occur due to InfluxDB expecting data of a certain interval, but I'm unsure if this is the case.
A feature that could build on this would be allowing the user to choose to dilute starting at the most recent data, to keep the more frequent data from the point where the downtime started. This would require the interval doubling to occur as soon as the limit is hit.