Project

General

Profile

Actions

Architecture #11147

open

Partition archive tables in postgres, as well as reports table

Added by Nicolas CHARLES almost 7 years ago. Updated almost 4 years ago.

Status:
New
Priority:
N/A
Assignee:
-
Category:
Performance and scalability
Effort required:
Name check:
Fix check:
Regression:

Description

Archived table have a very high wolumetry, and very few read/write.
However we do once per day a bulk INSERT and a bulk DELETE

This works as expected, but indexes in these tables are larger than necessary, and DELETE/VACUUMing are fairly slow
Partitionning these tables ( cf https://www.postgresql.org/docs/9.2/static/ddl-partitioning.html ) would allow to have one table per day/week/period (select the right value), that could be dropped when archived reports are deleted, reclaiming instantaneously space on disk - or if it's not per day, at least we could vacuum on a much smaller set of data, and reclaiming space faster

Extra advantage: partition can be stored on other device, so a user could choose to keep years of reports

However, implementation can be tricky - by needing to create every day a new partition - however partition manager make it easy (see https://github.com/hipay/postgresql-partition-manager, https://github.com/dimitri/PartMgr , etc)

We could also create two partitions on ruddersysevents, one for all the repaired reports (that we query for list of changes), and one for all the control reports (start and end runs) to improve perfs on reportings

Final though:
The real change would probably be to have only ruddersysevents, with partitions, and skip completely the archived tables, and a clever partition scheme (to be defined), but out of my head, i would keep last 2 days in one partition, then one partition per day (drawback: don't know if it's easy to move data from one partition to another)
Why two days ? so that we don't have reportings issues at midnight

Actions

Also available in: Atom PDF