| 1 | +--- |
| 2 | +date: 2025-07-20 |
| 3 | +title: What is the difference between OPTIMIZE FINAL and FINAL? |
| 4 | +tags: ['Core Data Concepts'] |
| 5 | +keywords: ['OPTIMIZE FINAL', 'FINAL'] |
| 6 | +description: 'Discusses the differences between OPTIMIZE FINAL and FINAL, and when to use and avoid them.' |
| 7 | +--- |
| 8 | + |
| 9 | +{frontMatter.description} |
| 10 | +{/* truncate */} |
| 11 | + |
| 12 | +# What is the difference between `OPTIMIZE FINAL` and `FINAL`? |
| 13 | + |
| 14 | +`OPTIMIZE FINAL` (in full, `OPTIMIZE TABLE ... FINAL`) is a DDL command that physically and |
| 15 | +permanently rewrites data on disk. It forces a merge of the data parts of a `MergeTree`-family |
| 16 | +table and, for engines such as `ReplacingMergeTree`, removes duplicate rows from storage in the process. |
| 17 | + |
| 18 | +`FINAL` is a **query-time** modifier that provides deduplicated results without |
| 19 | +changing the structure of the stored data. It works by performing merge logic at |
| 20 | +read-time. It is temporary, only affecting the current query result. |
| 21 | + |
| 22 | +Users are often advised to avoid `OPTIMIZE FINAL` because of its significant |
| 23 | +performance overhead, but that advice should not be carried over to `FINAL`. `FINAL` is often |
| 24 | +necessary to get results without duplicates, especially with table |
| 25 | +engines like `ReplacingMergeTree`, whose tables may contain duplicate rows that have not yet |
| 26 | +been replaced by the eventual background merge process. |
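|  | + |
|  | +The difference can be sketched with a small example. This is a minimal sketch, not taken |
|  | +from a real deployment: the table `user_profiles`, its columns, and the inserted values are |
|  | +hypothetical and chosen only for illustration. |
|  | + |
|  | +```sql |
|  | +-- Hypothetical table: rows with the same `user_id` are duplicates, |
|  | +-- and the row with the highest `version` wins when parts are merged. |
|  | +CREATE TABLE user_profiles |
|  | +( |
|  | +    user_id UInt64, |
|  | +    name    String, |
|  | +    version UInt32 |
|  | +) |
|  | +ENGINE = ReplacingMergeTree(version) |
|  | +ORDER BY user_id; |
|  | + |
|  | +-- Two separate inserts create two data parts holding the same key. |
|  | +INSERT INTO user_profiles VALUES (1, 'Alice', 1); |
|  | +INSERT INTO user_profiles VALUES (1, 'Alice Smith', 2); |
|  | + |
|  | +-- Without FINAL: may return both rows until a background merge has run. |
|  | +SELECT * FROM user_profiles WHERE user_id = 1; |
|  | + |
|  | +-- With FINAL: merge logic is applied at read time, so only the row with |
|  | +-- the highest version is returned. The data on disk is left unchanged. |
|  | +SELECT * FROM user_profiles FINAL WHERE user_id = 1; |
|  | + |
|  | +-- OPTIMIZE ... FINAL: forces the merge and rewrites the parts on disk, |
|  | +-- so the older row is gone even for queries that do not use FINAL |
|  | +-- (assuming no new rows are inserted afterwards). |
|  | +OPTIMIZE TABLE user_profiles FINAL; |
|  | +SELECT * FROM user_profiles WHERE user_id = 1; |
|  | +``` |
|  | + |
|  | +Rows inserted after the `OPTIMIZE` land in new parts, so duplicates can reappear until the |
|  | +next merge. `FINAL` remains the way to get deduplicated results at any given moment. |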
| 27 | + |
| 28 | +The table below summarises the key differences: |
| 29 | + |
| 30 | +|Aspect |`OPTIMIZE FINAL` | `FINAL` | |
| 31 | +|------------------|--------------------------------------------|----------------------------------------------------| |
| 32 | +|Type | DDL Command | Query Modifier | |
| 33 | +|Effect | Permanent storage optimization | Temporary query-time deduplication | |
| 34 | +|Performance Impact| High cost once, then faster queries         | Lower individual cost, but repeated for each query |
| 35 | +|Data Modification | Yes - physically changes storage | No - read-only operation | |
| 36 | +|Use Case | Periodic maintenance/optimization | Real-time deduplicated queries | |
| 37 | + |
| 38 | +## When to use each {#when-to-use-each} |
| 39 | + |
| 40 | +Use `OPTIMIZE FINAL` when: |
| 41 | + |
| 42 | +- You want to permanently improve query performance |
| 43 | +- You can afford the one-time optimization cost |
| 44 | +- You're doing periodic table maintenance |
| 45 | +- You want to physically clean up duplicate data |
| 46 | + |
| 47 | +Use `FINAL` when: |
| 48 | + |
| 49 | +- You need deduplicated results immediately |
| 50 | +- You can't wait for, or don't want, a permanent optimization |
| 51 | +- You only occasionally need deduplicated data |
| 52 | +- You're working with frequently changing data |
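|  | + |
|  | +For the periodic-maintenance case, the one-time cost can be bounded by optimizing a single |
|  | +partition rather than the whole table. The following is a minimal sketch, assuming a |
|  | +hypothetical table `events` partitioned by month. The table, its columns, and the partition |
|  | +value `202507` are illustrative only. |
|  | + |
|  | +```sql |
|  | +-- Hypothetical table, partitioned by month so maintenance can target one partition. |
|  | +CREATE TABLE events |
|  | +( |
|  | +    event_id   UInt64, |
|  | +    event_time DateTime, |
|  | +    payload    String |
|  | +) |
|  | +ENGINE = ReplacingMergeTree |
|  | +PARTITION BY toYYYYMM(event_time) |
|  | +ORDER BY event_id; |
|  | + |
|  | +-- Count the active parts per partition; run before and after maintenance to compare. |
|  | +SELECT partition, count() AS active_parts, sum(rows) AS total_rows |
|  | +FROM system.parts |
|  | +WHERE database = currentDatabase() AND table = 'events' AND active |
|  | +GROUP BY partition |
|  | +ORDER BY partition; |
|  | + |
|  | +-- Force a merge of one partition only (July 2025 in this sketch). |
|  | +OPTIMIZE TABLE events PARTITION 202507 FINAL; |
|  | +``` |
|  | + |
|  | +`ReplacingMergeTree` only deduplicates rows within the same partition, so this approach |
|  | +suits data whose duplicates fall in the same partition. |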
| 53 | + |
| 54 | +Both are valuable tools, but they serve different purposes in ClickHouse's deduplication strategy. |