From e19ff7162fab8cfa5a09598f86ff87f36d7367ce Mon Sep 17 00:00:00 2001
From: PUVVADA BHASKAR <2400030295@kluniversity.in>
Date: Tue, 4 Nov 2025 15:19:21 +0530
Subject: [PATCH 1/2] Updated deduplication section in zfsconcepts.7 for clarity

---
 man/man7/zfsconcepts.7 | 49 +++++++++++++++++++++---------------------
 1 file changed, 25 insertions(+), 24 deletions(-)

diff --git a/man/man7/zfsconcepts.7 b/man/man7/zfsconcepts.7
index bb2178d85bcd..1671eedd0f01 100644
--- a/man/man7/zfsconcepts.7
+++ b/man/man7/zfsconcepts.7
@@ -181,32 +181,33 @@ See
 .Xr systemd.mount 5
 for details.
 .Ss Deduplication
-Deduplication is the process for removing redundant data at the block level,
-reducing the total amount of data stored.
-If a file system has the
+Deduplication is the process of eliminating redundant data blocks at the storage
+level, so that only one copy of each unique block is kept. When the
 .Sy dedup
-property enabled, duplicate data blocks are removed synchronously.
-The result
-is that only unique data is stored and common components are shared among files.
-.Pp
-Deduplicating data is a very resource-intensive operation.
-It is generally recommended that you have at least 1.25 GiB of RAM
-per 1 TiB of storage when you enable deduplication.
-Calculating the exact requirement depends heavily
-on the type of data stored in the pool.
-.Pp
-Enabling deduplication on an improperly-designed system can result in
-performance issues (slow I/O and administrative operations).
-It can potentially lead to problems importing a pool due to memory exhaustion.
-Deduplication can consume significant processing power (CPU) and memory as well
-as generate additional disk I/O.
-.Pp
-Before creating a pool with deduplication enabled, ensure that you have planned
-your hardware requirements appropriately and implemented appropriate recovery
-practices, such as regular backups.
-Consider using the
+property is enabled on a dataset, ZFS compares new data to existing blocks and
+stores references instead of duplicate copies.
+
+.Pp
+While this can reduce storage usage when large amounts of identical data exist,
+deduplication is a very resource-intensive feature. It maintains a
+deduplication table (DDT) in memory, which can grow significantly depending on
+the amount of stored data. As a general guideline, at least 1.25 GiB of RAM per
+1 TiB of pool storage is recommended, though the actual requirement varies with
+workload and data type.
+
+.Pp
+Enabling deduplication without sufficient system resources can lead to slow I/O,
+excessive memory and CPU use, and in extreme cases, difficulty importing the
+pool due to memory exhaustion. For these reasons, deduplication is not generally
+recommended unless there is a clear need for it—such as virtual machine images
+or backup datasets containing highly duplicated data.
+
+.Pp
+For most users, the
 .Sy compression
-property as a less resource-intensive alternative.
+property offers a more efficient and safer way to save space with far less
+performance impact. Always test and verify system performance before enabling
+deduplication in a production environment.
 .Ss Block cloning
 Block cloning is a facility that allows a file (or parts of a file) to be
 .Qq cloned ,

From 765ec9df6dfd11e56a98158ffb21e2d5f3767727 Mon Sep 17 00:00:00 2001
From: PUVVADA BHASKAR <2400030295@kluniversity.in>
Date: Tue, 4 Nov 2025 16:21:19 +0530
Subject: [PATCH 2/2] docs: clarify deduplication and add block cloning details in zfsconcepts.7

---
 man/man7/zfsconcepts.7 | 43 +++++++++++++++++++++---------------------
 1 file changed, 22 insertions(+), 21 deletions(-)

diff --git a/man/man7/zfsconcepts.7 b/man/man7/zfsconcepts.7
index 1671eedd0f01..afe925dd63c1 100644
--- a/man/man7/zfsconcepts.7
+++ b/man/man7/zfsconcepts.7
@@ -181,35 +181,36 @@ See
 .Xr systemd.mount 5
 for details.
 .Ss Deduplication
-Deduplication is the process of eliminating redundant data blocks at the storage
-level, so that only one copy of each unique block is kept. When the
+Deduplication is the process of eliminating redundant data blocks at the
+storage level so that only one copy of each unique block is kept.
+When the
 .Sy dedup
 property is enabled on a dataset, ZFS compares new data to existing blocks and
 stores references instead of duplicate copies.
-
 .Pp
 While this can reduce storage usage when large amounts of identical data exist,
-deduplication is a very resource-intensive feature. It maintains a
+deduplication is a very resource-intensive feature.
+It maintains a
 deduplication table (DDT) in memory, which can grow significantly depending on
-the amount of stored data. As a general guideline, at least 1.25 GiB of RAM per
-1 TiB of pool storage is recommended, though the actual requirement varies with
-workload and data type.
-
+the amount of stored data.
+As a general guideline, at least 1.25 GiB of RAM per 1 TiB of pool storage is
+recommended, though the actual requirement varies with workload and data type.
 .Pp
 Enabling deduplication without sufficient system resources can lead to slow I/O,
 excessive memory and CPU use, and in extreme cases, difficulty importing the
-pool due to memory exhaustion. For these reasons, deduplication is not generally
-recommended unless there is a clear need for it—such as virtual machine images
-or backup datasets containing highly duplicated data.
-
+pool due to memory exhaustion.
+For these reasons, deduplication is not generally recommended unless there is a
+clear need for it, such as virtual machine images or backup datasets containing
+highly duplicated data.
 .Pp
 For most users, the
 .Sy compression
 property offers a more efficient and safer way to save space with far less
-performance impact. Always test and verify system performance before enabling
-deduplication in a production environment.
+performance impact.
+Always test and verify system performance before enabling deduplication in a
+production environment.
 .Ss Block cloning
-Block cloning is a facility that allows a file (or parts of a file) to be
+Block cloning is a facility that allows a file, or parts of a file, to be
 .Qq cloned ,
 that is, a shallow copy made where the existing data blocks are referenced
 rather than copied.
@@ -224,8 +225,8 @@ Cloned blocks are tracked in a special on-disk structure called the Block
 Reference Table
 .Po BRT
 .Pc .
-Unlike deduplication, this table has minimal overhead, so can be enabled at all
-times.
+Unlike deduplication, this table has minimal overhead, so it can be enabled at
+all times.
 .Pp
 Also unlike deduplication, cloning must be requested by a user program.
 Many common file copying programs, including newer versions of
@@ -233,15 +234,15 @@ Many common file copying programs, including newer versions of
 will try to create clones automatically.
 Look for
 .Qq clone ,
-.Qq dedupe
+.Qq dedupe ,
 or
 .Qq reflink
 in the documentation for more information.
 .Pp
 There are some limitations to block cloning.
-Only whole blocks can be cloned, and blocks can not be cloned if they are not
-yet written to disk, or if they are encrypted, or the source and destination
+Only whole blocks can be cloned, and blocks cannot be cloned if they are not yet
+written to disk, or if they are encrypted, or if the source and destination
 .Sy recordsize
 properties differ.
-The OS may add additional restrictions;
+The operating system may add additional restrictions;
 for example, most versions of Linux will not allow clones across datasets.
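The properties and behaviors this series documents can be exercised with standard ZFS administration commands. The sketch below is illustrative only, not part of the patch: it assumes a hypothetical pool `tank` with a dataset `tank/data`, requires root privileges and an existing pool, and so is not a runnable script.

```shell
# Enable deduplication on a single dataset (hypothetical names;
# needs an existing pool and root privileges):
zfs set dedup=on tank/data

# Inspect the deduplication table (DDT) histogram and dedup ratio,
# the in-memory structure whose RAM cost the section warns about:
zpool status -D tank

# The less resource-intensive alternative the section recommends:
zfs set compression=lz4 tank/data

# Block cloning: newer GNU coreutils cp can request a clone (reflink)
# instead of copying data blocks:
cp --reflink=auto source.img clone.img
```

Comparing `zfs get used,compressratio tank/data` before and after is a simple way to verify the space savings on a test system before enabling either feature in production.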