Architecture #17921
open
Improve searching in ruddersysevents for the reports in store run agents
Pull Request:
Effort required:
Name check: To do
Fix check: To do
Regression:
Description
At the moment, we run:
select max(id) as id from RudderSysEvents where id > ${fromId} and executionTimeStamp < before
which is wildly inefficient (see https://issues.rudder.io/issues/11147).
Then we run:
select distinct
    T.nodeid,
    T.executiontimestamp,
    coalesce(C.keyvalue, '') as nodeconfigid,
    coalesce(C.iscomplete, false) as complete,
    T.insertionid
from (
    select nodeid, executiontimestamp, min(id) as insertionid
    from ruddersysevents
    where id > ? and id <= ?
    group by nodeid, executiontimestamp
) as T
left join (
    select true as iscomplete, nodeid, executiontimestamp, keyvalue
    from ruddersysevents
    where id > ? and id <= ? and eventtype = 'control' and component = 'end'
) as C
    on T.nodeid = C.nodeid and T.executiontimestamp = C.executiontimestamp
Because agent runs can arrive out of order, we cannot simply take the execution timestamp of the last id as the minimum execution time. The risk is that a node's clock is in the future, which would push the minimum forward and prevent any compliance from being computed for a long time.
We could use some heuristics:
- take the lowest time from the previous batch as the minimum time to search from (so we need to store it), minus x minutes
- if that lowest time is in the future relative to the current time, take the current time minus x minutes?
- and in any case always ensure that there is a minimum interval between the minimum time and now
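
The three heuristics above could be combined roughly as follows. This is only a sketch, not the Rudder implementation: the function name `lower_bound`, the parameters `margin` (the "x minutes") and `min_window` (the guaranteed interval back from now), and the handling of a missing previous batch are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def lower_bound(
    prev_batch_min: Optional[datetime],
    now: datetime,
    margin: timedelta,
    min_window: timedelta,
) -> datetime:
    """Return the earliest execution timestamp to search from.

    prev_batch_min: lowest execution time seen in the previous batch
                    (hypothetical stored value; None if nothing is stored yet).
    margin:         the "x minutes" safety margin from the heuristics.
    min_window:     minimum interval that must separate the bound from now.
    """
    if prev_batch_min is None or prev_batch_min > now:
        # Heuristic 2: the stored time is in the future (or, by assumption,
        # unknown), so fall back to the current time minus the margin.
        candidate = now - margin
    else:
        # Heuristic 1: previous batch's lowest time, minus the margin.
        candidate = prev_batch_min - margin
    # Heuristic 3: never let the bound get closer to now than min_window.
    return min(candidate, now - min_window)


if __name__ == "__main__":
    now = datetime(2020, 7, 1, 12, 0, tzinfo=timezone.utc)
    stored = datetime(2020, 7, 1, 13, 0, tzinfo=timezone.utc)  # node in the future
    print(lower_bound(stored, now, timedelta(minutes=10), timedelta(minutes=30)))
```

With this shape, a node reporting from the future can no longer drag the search window forward: the bound is capped at `now - min_window` whatever the stored value is.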
Or we could have rudder-relay store this information in reportsexecution and skip the first part altogether, then update a table recording which runs have been updated; from there, the webapp would catch up on these runs and compute the compliance.
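
To make the relay-side alternative more concrete, here is a small sketch using an in-memory SQLite database. Everything in it is assumed for illustration: the simplified `reportsexecution` columns, the hypothetical `updated_runs` table, and the function names `relay_store_run` and `webapp_catch_up`. The real schema and the PostgreSQL statements would differ.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a simplified, assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE reportsexecution (
            nodeid       TEXT NOT NULL,
            date         TEXT NOT NULL,
            nodeconfigid TEXT NOT NULL DEFAULT '',
            complete     INTEGER NOT NULL DEFAULT 0,
            PRIMARY KEY (nodeid, date)
        );
        -- Hypothetical table: runs the webapp still has to process.
        CREATE TABLE updated_runs (
            nodeid TEXT NOT NULL,
            date   TEXT NOT NULL,
            PRIMARY KEY (nodeid, date)
        );
        """
    )
    return conn


def relay_store_run(conn, nodeid, date, nodeconfigid, complete):
    """Relay side: record the run and flag it as updated, in one transaction."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO reportsexecution "
            "(nodeid, date, nodeconfigid, complete) VALUES (?, ?, ?, ?)",
            (nodeid, date, nodeconfigid, int(complete)),
        )
        conn.execute(
            "INSERT OR IGNORE INTO updated_runs (nodeid, date) VALUES (?, ?)",
            (nodeid, date),
        )


def webapp_catch_up(conn):
    """Webapp side: fetch the flagged runs, then clear the flags."""
    with conn:
        rows = conn.execute(
            "SELECT r.nodeid, r.date, r.nodeconfigid, r.complete "
            "FROM updated_runs u JOIN reportsexecution r "
            "ON r.nodeid = u.nodeid AND r.date = u.date "
            "ORDER BY r.date, r.nodeid"
        ).fetchall()
        conn.execute("DELETE FROM updated_runs")
    return rows  # compliance would be computed from these runs


if __name__ == "__main__":
    db = make_db()
    relay_store_run(db, "node1", "2020-07-01T12:00:00Z", "cfg-1", True)
    print(webapp_catch_up(db))
```

The point of the sketch is that the webapp no longer scans ruddersysevents by id range at all; it only reads the runs that were explicitly flagged as updated.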