## `docs/docs/delta-lake-migration.md`

The `iceberg-delta-lake` module is not bundled with Spark and Flink engine runtimes.
### Compatibilities
The module is built and tested with `Delta Standalone:0.6.0` and supports Delta Lake tables with the following protocol version:

* `minReaderVersion`: 1
* `minWriterVersion`: 2
### API

The `iceberg-delta-lake` module provides an interface named `DeltaLakeToIcebergMigrationActionsProvider`, which contains actions that help convert Delta Lake tables to Iceberg.

The supported actions are:

* `snapshotDeltaLakeTable`: snapshot an existing Delta Lake table to an Iceberg table
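A minimal usage sketch of the action, assuming the `iceberg-delta-lake` dependency is on the classpath; the catalog, table identifier, and locations below are hypothetical placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.catalog.Catalog;
import org.apache.iceberg.catalog.TableIdentifier;
import org.apache.iceberg.delta.DeltaLakeToIcebergMigrationActionsProvider;

// Hypothetical source location and destination identifier.
String sourceDeltaLakeTableLocation = "s3://my-bucket/delta-table";
TableIdentifier destTableIdentifier = TableIdentifier.of("my_db", "my_table");
Catalog icebergCatalog = loadCatalog();         // however the Iceberg catalog is obtained
Configuration hadoopConf = new Configuration(); // configuration for reading the Delta log

DeltaLakeToIcebergMigrationActionsProvider.defaultActions()
    .snapshotDeltaLakeTable(sourceDeltaLakeTableLocation)
    .as(destTableIdentifier)
    .icebergCatalog(icebergCatalog)
    .deltaLakeConfiguration(hadoopConf)
    .execute();
```

The action reads the Delta transaction log at the given location and commits an equivalent Iceberg table to the supplied catalog; the original Delta table is left untouched.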
## `docs/docs/hive.md`

### SELECT
Select statements work the same on Iceberg tables in Hive. You will see the Iceberg benefits over Hive in compilation and execution:

* **No file system listings** - especially important on blob stores, like S3
* **No partition listing from** the Metastore
* **Advanced partition filtering** - the partition keys are not needed in the queries when they could be calculated
* Could handle a **higher number of partitions** than normal Hive tables
Here are the feature highlights for Iceberg Hive read support:

1. **Predicate pushdown**: Pushdown of the Hive SQL `WHERE` clause has been implemented so that these filters are used at the Iceberg `TableScan` level as well as by the Parquet and ORC readers.
2. **Column projection**: Columns from the Hive SQL `SELECT` clause are projected down to the Iceberg readers to reduce the number of columns read.
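For example, with a hypothetical Iceberg-backed `orders` table, an ordinary Hive query benefits from both optimizations: the `WHERE` clause is pushed down to the Iceberg scan to prune files, and only the projected columns are read:

```sql
-- Hypothetical table: the predicate prunes files at the Iceberg TableScan
-- level, and only the two selected columns are read from Parquet/ORC.
SELECT customer_id, order_total
FROM orders
WHERE order_date = '2023-01-01';
```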
## `docs/docs/metrics-reporting.md`

### ScanReport
A [`ScanReport`](../../javadoc/{{ icebergVersion }}/org/apache/iceberg/metrics/ScanReport.html) carries metrics being collected during scan planning against a given table. Amongst some general information about the involved table, such as the snapshot id or the table name, it includes metrics like:

* total scan planning duration
* number of data/delete files included in the result
* number of data/delete manifests scanned/skipped
### CommitReport
A [`CommitReport`](../../javadoc/{{ icebergVersion }}/org/apache/iceberg/metrics/CommitReport.html) carries metrics being collected after committing changes to a table (aka producing a snapshot). Amongst some general information about the involved table, such as the snapshot id or the table name, it includes metrics like:

* total duration
* number of attempts required for the commit to succeed
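As a sketch, a custom reporter only needs to implement the single `report` method of the `MetricsReporter` interface; the class name and the logging choice here are illustrative, not part of the library:

```java
import org.apache.iceberg.metrics.CommitReport;
import org.apache.iceberg.metrics.MetricsReport;
import org.apache.iceberg.metrics.MetricsReporter;
import org.apache.iceberg.metrics.ScanReport;

// Hypothetical reporter that distinguishes the two report types above.
public class PrintingMetricsReporter implements MetricsReporter {
  @Override
  public void report(MetricsReport report) {
    if (report instanceof ScanReport) {
      System.out.println("Scan planning finished: " + report);
    } else if (report instanceof CommitReport) {
      System.out.println("Commit finished: " + report);
    }
  }
}
```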
Iceberg can rewrite position delete files, which serves two purposes:
* Minor Compaction: Compact small position delete files into larger ones. This reduces the size of metadata stored in manifest files and the overhead of opening small delete files.
* Remove Dangling Deletes: Filter out position delete records that refer to data files that are no longer live. After `rewrite_data_files`, position delete records pointing to the rewritten data files are not always marked for removal, and can remain tracked by the table's live snapshot metadata. This is known as the 'dangling delete' problem.
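Both purposes are served by the `rewrite_position_delete_files` procedure; the catalog and table names below are placeholders:

```sql
-- Compacts small position delete files and drops dangling deletes.
CALL catalog_name.system.rewrite_position_delete_files('db.sample');
```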
Creates a view that contains the changes from a given table.

* `identifier_columns` (array&lt;string&gt;): The list of identifier columns to compute updates. If the argument `compute_updates` is set to true and `identifier_columns` are not provided, the table’s current identifier fields will be used.
Here is a list of commonly used Spark read options:
* `start-snapshot-id`: the exclusive start snapshot ID. If not provided, it reads from the table’s first snapshot inclusively.
* `end-snapshot-id`: the inclusive end snapshot ID, defaulting to the table’s current snapshot.
* `start-timestamp`: the exclusive start timestamp. If not provided, it reads from the table’s first snapshot inclusively.
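These read options are passed through the `options` map of the `create_changelog_view` procedure; the catalog, table name, and snapshot IDs below are placeholders:

```sql
CALL spark_catalog.system.create_changelog_view(
  table => 'db.tbl',
  options => map('start-snapshot-id', '1', 'end-snapshot-id', '2')
);
```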
Please note that the changelog view includes Change Data Capture (CDC) metadata columns
that provide additional information about the changes being tracked. These columns are:

- `_change_type`: the type of change. It has one of the following values: `INSERT`, `DELETE`, `UPDATE_BEFORE`, or `UPDATE_AFTER`.
- `_change_ordinal`: the order of changes
- `_commit_snapshot_id`: the snapshot ID where the change occurred
!!! info
    Content refers to the type of content stored by the data file:

    * 0 Data
    * 1 Position Deletes
    * 2 Equality Deletes

To show only data files or delete files, query `prod.db.table.data_files` and `prod.db.table.delete_files` respectively.
To show all files, data files and delete files across all tracked snapshots, query `prod.db.table.all_files`, `prod.db.table.all_data_files` and `prod.db.table.all_delete_files` respectively.
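For example, the `content` field can be inspected directly when querying the files metadata table (the table name is a placeholder):

```sql
-- content = 0 for data files, 1 for position deletes, 2 for equality deletes
SELECT content, file_path, record_count
FROM prod.db.table.files;
```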
`SELECT * FROM prod.db.table.manifests;`

1. Fields within the `partition_summaries` column of the manifests table correspond to `field_summary` structs within the [manifest list](../../spec.md#manifest-lists), with the following order:
    - `contains_null`
    - `contains_nan`

`SELECT * FROM prod.db.table.partitions;`

1. For unpartitioned tables, the partitions table will not contain the partition and spec_id fields.
2. The partitions metadata table shows partitions with data files or delete files in the current snapshot. However, delete files are not applied, and so in some cases partitions may be shown even though all their data rows are marked deleted by delete files.
## `site/docs/spec.md`

Grouping a subset of a struct’s fields into a nested struct is **not** allowed, nor is moving fields from a nested struct into its immediate parent struct (`struct<a, b, c> ↔ struct<a, struct<b, c>>`). Evolving primitive types to structs is **not** allowed, nor is evolving a single-field struct to a primitive (`map<string, int> ↔ map<string, struct<int>>`).
Struct evolution requires the following rules for default values:

* The `initial-default` must be set when a field is added and cannot change
* The `write-default` must be set when a field is added and may change
* When a required field is added, both defaults must be set to a non-null value
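As an illustrative sketch (the field id, name, and default values are hypothetical), a required field added under these rules could be serialized in a schema as:

```json
{
  "id": 4,
  "name": "region",
  "required": true,
  "type": "string",
  "initial-default": "unknown",
  "write-default": "unknown"
}
```

Here `initial-default` fills the column for rows written before the field existed, while `write-default` applies to new rows that omit the field.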
### Version 3

Default values are added to struct fields in v3.

* The `write-default` is a forward-compatible change because it is only used at write time. Old writers will fail because the field is missing.
* Tables with `initial-default` will be read correctly by older readers if `initial-default` is always null for optional fields. Otherwise, old readers will default optional columns with null. Old readers will fail to read required fields which are populated by `initial-default` because that default is not supported.
## `site/docs/view-spec.md`

* _optional_ `properties`: A string to string map of view properties [2]

Notes:

1. The number of versions to retain is controlled by the table property: `version.history.num-entries`.
2. Properties are used for metadata such as `comment` and for settings that affect view maintenance. This is not intended to be used for arbitrary metadata.
View versions are immutable. Once a version is created, it cannot be changed. This means that representations for a version cannot be changed. If a view definition changes (or new representations are to be added), a new version must be created.

Each representation is an object with at least one common field, `type`, that is one of the following:

* `sql`: a SQL SELECT statement that defines the view

Representations further define metadata for each type.
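For instance, a `sql` representation stores the statement alongside its dialect; the query text here is hypothetical:

```json
{
  "type": "sql",
  "sql": "SELECT id, name FROM base_tbl",
  "dialect": "spark"
}
```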