Content-Defined Chunking (CDC) is a data deduplication method that splits files into variable-size blocks, or chunks, whose boundaries are determined by the data's content rather than by fixed offsets or a fixed block size.
This technique analyzes the content of the data to choose cut points, so identical content produces identical chunks even when it appears at a different position, which makes duplicates easier to identify and eliminate across a large data set.
Because chunk sizes adapt to patterns in the data, CDC improves deduplication efficiency: less data has to be transferred to the backup storage, and the space needed to store backups is significantly reduced.
This deduplication technique is used by modern backup engines such as Restic and Borg.
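
To make the idea of content-based cut points concrete, here is a minimal Python sketch. It uses a simple gear-style rolling hash, chosen only for illustration; it is not claimed to be the algorithm that Restic or Borg implement. The names `chunk`, `dedup_stats`, `MIN_SIZE`, `MAX_SIZE`, `MASK` and `GEAR`, as well as the parameter values, are assumptions made for this example.

```python
import hashlib
import random
from typing import List, Tuple

# Illustrative parameters (assumptions, not values taken from any real tool).
MIN_SIZE = 64        # never cut a chunk shorter than this
MAX_SIZE = 1024      # always cut once a chunk reaches this size
MASK = 0xFF000000    # cut when the top 8 bits of the hash are zero
                     # (about one position in 256 for random-looking data)

# Gear table: one fixed pseudo-random 32-bit value per possible byte value.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def chunk(data: bytes) -> List[bytes]:
    """Split data into variable-size chunks at content-defined cut points."""
    chunks = []
    start = 0
    h = 0
    for i, b in enumerate(data):
        # Rolling hash: older bytes are shifted out after 32 steps, so the
        # value only depends on the most recent 32 bytes of content.
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing chunk, may be shorter than MIN_SIZE
    return chunks


def dedup_stats(old: bytes, new: bytes) -> Tuple[int, int]:
    """Return (chunks of `new` already stored for `old`, total chunks of `new`)."""
    stored = {hashlib.sha256(c).digest() for c in chunk(old)}
    new_chunks = chunk(new)
    reused = sum(1 for c in new_chunks if hashlib.sha256(c).digest() in stored)
    return reused, len(new_chunks)


if __name__ == "__main__":
    rng = random.Random(1)
    old = bytes(rng.getrandbits(8) for _ in range(20000))
    # Insert a few bytes in the middle: every later byte shifts position.
    new = old[:5000] + b"INSERTED" + old[5000:]
    reused, total = dedup_stats(old, new)
    print(f"{reused} of {total} chunks were already stored")
```

Because a cut decision depends only on the bytes just before it, an insertion changes the chunks around the edit, while chunks further away can typically be found again by their hash. With fixed-size blocks, every block after the insertion point would shift and no longer match.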