From 0ef9fdf106645323b82007cf376d2ddf09785031 Mon Sep 17 00:00:00 2001
From: Sebastian Urchs <surchs@users.noreply.github.com>
Date: Fri, 30 May 2025 14:48:01 +0200
Subject: [PATCH 01/53] [ENH] Integrate BEP036 - Phenotypic Data Guidelines

BEP036 brings guidelines for best tabular phenotypic data to the BIDS specification.

- Includes an appendix called `phenotype.md`
- Includes admonitions for the guidelines in-line with modality agnostic files sections

---------

Co-authored-by: Eric Earl <eric.earl@nih.gov>
Co-authored-by: Samuel Guay <samuel.guay@umontreal.ca>
Co-authored-by: Sebastian Urchs <sebastian.urchs@mcgill.ca>
Co-authored-by: Arshitha B <arshitha.basavaraj@iiitb.ac.in>
---
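Reviewer note: the row-key rules of guideline 3 in this patch (the first column
is `participant_id`, a `session_id` column exists in multi-session datasets, and
the `participant_id`/`session_id`/`run` combination is unique) can be sketched
as a small check. This is only an illustrative sketch under those assumptions.
It is not part of the BIDS validator or of this patch, and the function name
`check_phenotype_tsv` and its `multi_session` flag are hypothetical.

```python
import csv
import io


def check_phenotype_tsv(text, multi_session=False):
    """Return a list of problems found in the text of a phenotype TSV file."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    rows = [row for row in reader if row]  # ignore completely empty lines
    if not rows:
        return ["file is empty"]
    header, body = rows[0], rows[1:]
    problems = []
    if header[0] != "participant_id":
        problems.append("first column is not participant_id")
    if multi_session and "session_id" not in header:
        problems.append("session_id column is missing")
    # Only the key columns that are actually present take part in the key.
    key_columns = [c for c in ("participant_id", "session_id", "run") if c in header]
    indices = [header.index(c) for c in key_columns]
    seen = set()
    for line_number, row in enumerate(body, start=2):
        if len(row) != len(header):
            problems.append(f"line {line_number}: wrong number of columns")
            continue
        key = tuple(row[i] for i in indices)
        if key in seen:
            problems.append(f"line {line_number}: duplicate key {key}")
        seen.add(key)
    return problems
```

Calling `check_phenotype_tsv(text, multi_session=True)` on the text of a
`phenotype/<measurement_tool_name>.tsv` file returns an empty list when none of
these rules is violated.
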
 src/appendices/phenotype.md                   | 331 ++++++++++++++++++
 src/common-principles.md                      |   8 +-
 .../data-summary-files.md                     | 286 +++++++++++++--
 src/schema/objects/files.yaml                 |   7 +-
 4 files changed, 602 insertions(+), 30 deletions(-)
 create mode 100644 src/appendices/phenotype.md
diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
new file mode 100644
index 0000000000..53afb47206
--- /dev/null
+++ b/src/appendices/phenotype.md
@@ -0,0 +1,331 @@
+# Tabular phenotypic data guidelines
+
+This appendix is a collection of guidelines and examples for creating well-organized aggregated tabular phenotypic data.
+
+## Guidelines
+
+These guidelines are all **RECOMMENDED** when preparing
+tabular phenotypic data like the
+participants file, sessions file, demographics file,
+or phenotypic and assessment data.
+The language below uses REQUIRED, MUST, and similar keywords to state
+what is needed in order to follow these **RECOMMENDED** guidelines.
+
+### 1. Always pair tabular data with data dictionaries
+
+Tabular phenotypic data MUST be prepared as one pair of a tabular file
+in tab-separated value (TSV) format and a corresponding data dictionary
+in JavaScript Object Notation (JSON) format.
+
+### 2. Aggregate data across sessions
+
+Aggregation refers to the contents of the TSV file. It is REQUIRED
+to collect all participant data into one TSV per tabular phenotypic file.
+
+### 3. Ensure minimal annotation for phenotypic and assessment data
+
+In phenotypic and assessment data each measurement tool has an independent
+aggregated data TSV file in which the user collects all subjects, sessions,
+and/or runs of data as one entry per row (with a row defined by
+the smallest unit of acquisition). In other words:
+
+1. Each row MUST start with `participant_id`.
+2. Each TSV file MUST contain a `session_id` column when
+multiple [sessions](../glossary.md#session-entities)[^1] are present
+in the data set regardless of whether those sessions are in
+the `phenotype/` data, `sub-<label>/` data, or a combination of the two.
+3. If more than one of the same measurement tool is acquired within
+the same `session_id`, a `run` column MUST be added.
+4. To encode the acquisition time for a measurement tool’s `session_id`,
+add the `session_id` to the sessions file and
+include the OPTIONAL `acq_time` column.
+
+To summarize this guideline as a table:
+
+| **Column name**  | **Requirement** | **Description** |
+| :--------------- | :-------------- | :-------------- |
+| `participant_id` | REQUIRED        | MUST be the first column in the file. Note that data for one participant MAY be represented across multiple rows in case of multiple sessions or runs, and therefore the entry in the `participant_id` column will be repeated. |
+| `session_id`     | CONDITIONAL ; If sessions are defined in the dataset | A `session_id` column MUST be added to all tabular files in the phenotype directory as soon as multiple sessions are present in the data set regardless of whether those sessions are in the `phenotype/` data, `sub-<label>/` data, or a combination of the two. |
+| `run`            | CONDITIONAL ; If there are multiple runs within any session | A chronological `run` number is used when a measurement tool or assessment described by a tabular file was repeated within a session. |
+| `acq_time`       | OPTIONAL        | If acquisition time is available, the `acq_time` column MAY be used to record the time of acquisition of each row in the tabular file. |
+
+Furthermore, if you have to add a `session_id` column to the
+tabular phenotypic data, you then MUST also introduce a session directory to the
+imaging data, even if only one imaging session has been created.
+This rule can be considered as "**if anyone uses sessions, everyone uses sessions**."
+And vice versa, if imaging data has session directories,
+all imaging data and tabular phenotypic data MUST have sessions.
+
+This produces a file in which same-participant entries can take up as many rows
+as needed according to the smallest unit of acquisition.
+The combination of values in the `participant_id`, `session_id`, and `run` (if present)
+columns MUST be unique for the entire tabular file.
+
+### 4. Add `MeasurementToolMetadata` to each tabular phenotypic measurement tool
+
+Whenever possible, it is RECOMMENDED to add `MeasurementToolMetadata` to
+each `phenotype/<measurement_tool_name>.json` data dictionary.
+This improves reusability and provides clarity about the measurement tool.
+
+### 5. Use the demographics file for common variables about participants
+
+Some studies collect demographics into their own tabular phenotypic data file already.
+In these cases, it is RECOMMENDED to house this data in the `phenotype/` directory
+as a TSV called `demographics.tsv` and its corresponding data dictionary JSON
+called `demographics.json`.
+
+### 6. Store longitudinal age in the demographics file
+
+It is RECOMMENDED to use the `age` column to record participant age
+at every session in longitudinal or multi-session data sets.
+This reduces data duplication across tabular data files. The `Units` of `age`
+do not have to be years so long as the units of the age
+are written in `phenotype/demographics.json`.
+Consider participant privacy or study objectives when selecting
+the `Units` of `age` or the accuracy of `age` data.
+
+### 7. Use the sessions file at the root level
+
+If there is more than one session for any one participant, then
+it is REQUIRED to provide a sessions file at the dataset root.
+The sessions file MUST list all sessions for all subjects across
+imaging and tabular phenotypic data.
+
+When a sessions file is in use, you MUST NOT provide additional sessions files
+at the participant-level which would otherwise use the inheritance principle.
+If a sessions file is provided, then it MUST begin with a `participant_id` column
+followed immediately by a `session_id` column. The data dictionary JSON file’s
+`session_id` field MUST include `Levels` with the description of each `session_id`.
+
+### 8. Record acquisition time of sessions with `acq_time`
+
+Whenever possible, it is RECOMMENDED to also collect acquisition time for
+tabular phenotypic data and store the time of acquisition[^2] of each row
+inside a column named `acq_time` in the sessions file.
+This is consistent with how acquisition time is recorded for MRI data
+and other time-sensitive measurements (e.g. systolic blood pressure).
+
+When needed to preserve participant privacy, you SHOULD record
+relative acquisition times with respect to the earliest session.
+Relative session acquisition times MAY be listed as durations from
+the earliest session (baseline) in days, months, or years
+using the `acq_time` column.
+
+## Summary
+
+This appendix describes eight guidelines for best tabular phenotypic data.
+A short summary table here describes when to use which files.
+
+| File                           | Single session data | Multiple session data |
+| :----------------------------- | :------------------ | :-------------------- |
+| Participants                   | RECOMMENDED         | RECOMMENDED           |
+| Phenotypic and assessment data | RECOMMENDED         | RECOMMENDED           |
+| Sessions                       | OPTIONAL            | REQUIRED              |
+| Demographics                   | OPTIONAL            | RECOMMENDED           |
+
+## Examples
+
+What follows are a few common use case examples for tabular phenotypic files.
+
+### 1 participant session with both non-tabular and tabular phenotypic data
+
+File tree
+
+```Text
+phenotype/
+    <measurement_tool_name>.json
+    <measurement_tool_name>.tsv
+sub-01/anat/
+    sub-01_T1w.json
+    sub-01_T1w.nii.gz
+```
+
+Contents of `phenotype/<measurement_tool_name>.tsv`
+
+```Text
+participant_id measurement_1 measurement_2
+sub-01 value1 value2
+```
+
+### 1 participant with 2 sessions, where 1 session is only tabular phenotype and the other is only imaging
+
+With only one imaging and one phenotypic session each in this example you might want
+to merge both imaging and phenotypic data under one session. But it is more correct to
+have separate sessions for the imaging and phenotypic data, especially if
+the sessions were collected days, weeks, or months apart. You can denote both sessions
+and their acquisition time in the `sessions.tsv` file and have `session_id` `Levels` noted
+in the `sessions.json` sidecar. Below are a CORRECT and an INCORRECT example
+of prepared data following these guidelines.
+
+#### CORRECT
+
+File tree
+
+```Text
+phenotype/
+    <measurement_tool_name>.json
+    <measurement_tool_name>.tsv
+sub-01/ses-MRI/anat/
+    sub-01_ses-MRI_T1w.json
+    sub-01_ses-MRI_T1w.nii.gz
+```
+
+Contents of `phenotype/<measurement_tool_name>.tsv`
+
+```Text
+participant_id session_id measurement_1 measurement_2
+sub-01         ses-pheno  value1        value2
+```
+
+#### INCORRECT
+
+File tree
+
+```Text
+phenotype/
+    <measurement_tool_name>.json
+    <measurement_tool_name>.tsv
+sub-01/anat/
+    sub-01_T1w.json
+    sub-01_T1w.nii.gz
+```
+
+Contents of `phenotype/<measurement_tool_name>.tsv`
+
+```Text
+participant_id measurement_1 measurement_2
+sub-01         value1        value2
+```
+
+A session directory **MUST** be present in the participant directory and
+the `session_id` column **MUST** be present in `<measurement_tool_name>.tsv` as well.
+Sessions must be used consistently for the combination of tabular and
+non-tabular phenotypic data.
+
+### 2 participants with a mix of tabular phenotypic data and imaging sessions
+
+File tree
+
+```Text
+phenotype/
+    <measurement_tool_name>.json
+    <measurement_tool_name>.tsv
+sub-01/
+    ses-MRI1/
+        anat/
+            sub-01_ses-MRI1_T1w.json
+            sub-01_ses-MRI1_T1w.nii.gz
+    ses-MRI2/
+        anat/
+            sub-01_ses-MRI2_T1w.json
+            sub-01_ses-MRI2_T1w.nii.gz
+sub-02/
+    ses-MRI1/
+        anat/
+            sub-02_ses-MRI1_T1w.json
+            sub-02_ses-MRI1_T1w.nii.gz
+```
+
+Contents of `phenotype/<measurement_tool_name>.tsv`
+
+```Text
+participant_id session_id measurement_1 measurement_2
+sub-01         ses-pheno1 value1        value2
+sub-02         ses-pheno1 value3        value4
+sub-02         ses-pheno2 value5        value6
+```
+
+### 3 participants with 3 different kinds of sessions among them
+
+The `ses-baseline` session collects an MRI and tabular phenotypic data.
+
+File tree
+
+```Text
+participants.json
+participants.tsv
+sessions.json
+sessions.tsv
+phenotype/
+    demographics.json
+    demographics.tsv
+    ...
+sub-01/
+    ses-baseline/
+    ses-followupMRI/
+sub-02/
+    ses-baseline/
+sub-03/
+    ses-baseline/
+    ses-followupMRI/
+```
+
+Contents of `sessions.tsv`.
+
+```Text
+participant_id session_id      acq_time
+sub-01         ses-baseline    2001-01-01T12:05:00
+sub-01         ses-followupMRI 2001-07-01T13:33:00
+sub-01         ses-interview   2002-01-01T11:21:00
+sub-02         ses-baseline    2001-04-01T11:01:00
+sub-02         ses-interview   2002-04-01T14:08:00
+sub-03         ses-baseline    2001-09-01T11:45:00
+sub-03         ses-followupMRI 2002-03-01T12:17:00
+```
+
+Contents of `sessions.json`. Note how the `session_id` `Levels` are clearly described.
+
+```json
+{
+    "participant_id": {
+        "Description": "BIDS participant identifier"
+    },
+    "session_id": {
+        "Description": "BIDS session identifier",
+        "Levels": {
+            "ses-baseline": "Baseline visit for MRI and assessments",
+            "ses-followupMRI": "MRI follow-up 6 months after baseline",
+            "ses-interview": "In-person follow-up 1 year after baseline"
+        }
+    },
+    "acq_time": {
+        "Description": "When the data acquisition started"
+    }
+}
+```
+
+Contents of `participants.tsv`.
+
+```Text
+participant_id sex
+sub-01         M
+sub-02         F
+sub-03         F
+```
+
+Contents of `phenotype/demographics.tsv`. Measures or features that can change
+from session to session belong here especially.
+
+```Text
+participant_id session_id      age gender race household_income
+sub-01         ses-baseline    10  3      4    5
+sub-01         ses-followupMRI 10  3      4    5
+sub-01         ses-interview   11  4      4    6
+sub-02         ses-baseline    9   1      3    3
+sub-02         ses-interview   10  1      7    3
+sub-03         ses-baseline    11  2      10   4
+sub-03         ses-followupMRI 12  5      10   4
+```
+
+For more complete examples, see the `pheno00*`
+[bids-examples on GitHub](https://github.com/bids-standard/bids-examples/).
+
+[^1]: A session is any logical grouping of imaging and behavioral data consistent
+across participants. A session can (but doesn't have to) be synonymous with a visit
+in a longitudinal study. In situations where different data types are obtained over
+several visits (for example fMRI on one day followed by DWI the day after)
+those can still be grouped in one session. Refer to the
+[definition of session](../glossary.md#session-entities) for more details.
+
+[^2]: Datetime format and the anonymization procedure are
+described in [Units](../common-principles.md#units).
diff --git a/src/common-principles.md b/src/common-principles.md
index eeb4170244..3b5a6dbd36 100644
--- a/src/common-principles.md
+++ b/src/common-principles.md
@@ -470,7 +470,7 @@ NIfTI header.
 
 ### Tabular files
 
-Tabular data MUST be saved as plain-text, tab-delimited values (TSV) files
+Tabular data MUST be saved as plain-text, tab-separated values (TSV) files
 (with [extension `.tsv`](glossary.md#tsv-extensions)),
 that is, [CSV files](https://en.wikipedia.org/wiki/Comma-separated_values) where commas are replaced by tab characters.
 Tabs MUST be true tab characters and MUST NOT be a series of space characters.
@@ -532,6 +532,12 @@ Note that if a field name included in the data dictionary matches a column name
 then that field MUST contain a description of the corresponding column,
 using an object containing the following fields:
 
+!!! success "Guideline 1"
+
+    For [best tabular phenotypic data](./appendices/phenotype.md):
+    Each tabular phenotypic data TSV file MUST be accompanied by
+    a corresponding data dictionary JSON file.
+
 <!-- This block generates a metadata table.
 The definitions of these fields can be found in
   src/schema/objects/metadata.yaml
diff --git a/src/modality-agnostic-files/data-summary-files.md b/src/modality-agnostic-files/data-summary-files.md
index d6b5032c2b..fc02a5cbec 100644
--- a/src/modality-agnostic-files/data-summary-files.md
+++ b/src/modality-agnostic-files/data-summary-files.md
@@ -32,11 +32,11 @@ available").
 
 `participants.tsv` example:
 
-```tsv
-participant_id	age	sex	handedness	group
-sub-01	34	M	right	read
-sub-02	12	F	right	write
-sub-03	33	F	n/a	read
+```Text
+participant_id age sex handedness group
+sub-01         34  M   right      read
+sub-02         12  F   right      write
+sub-03         33  F   n/a        read
 ```
 
 It is RECOMMENDED to accompany each `participants.tsv` file with a sidecar
@@ -106,13 +106,13 @@ and a guide for using macros can be found at
 
 `samples.tsv` example:
 
-```tsv
-sample_id	participant_id	sample_type	derived_from
-sample-01	sub-01	tissue	n/a
-sample-02	sub-01	tissue	sample-01
-sample-03	sub-01	tissue	sample-01
-sample-04	sub-02	tissue	n/a
-sample-05	sub-02	tissue	n/a
+```Text
+sample_id participant_id sample_type derived_from
+sample-01 sub-01         tissue      n/a
+sample-02 sub-01         tissue      sample-01
+sample-03 sub-01         tissue      sample-01
+sample-04 sub-02         tissue      n/a
+sample-05 sub-02         tissue      n/a
 ```
 
 It is RECOMMENDED to accompany each `samples.tsv` file with a sidecar
@@ -132,6 +132,177 @@ It is RECOMMENDED to accompany each `samples.tsv` file with a sidecar
 }
 ```
 
+## Phenotypic and assessment data
+
+Template:
+
+```Text
+phenotype/
+    <measurement_tool_name>.tsv
+    <measurement_tool_name>.json
+```
+
+Optional: Yes
+
+If the dataset includes multiple sets of participant level measurements (for
+example responses from multiple questionnaires) they can be split into
+individual files separate from `participants.tsv`.
+
+Each of the measurement files MUST be kept in a `/phenotype` directory placed
+at the root of the BIDS dataset and MUST end with the `.tsv` extension.
+Filenames SHOULD be chosen to reflect the contents of the file.
+For example, the "Adult ADHD Clinical Diagnostic Scale" could be saved in a file
+called `phenotype/acds_adult.tsv`.
+
+The files can include an arbitrary set of columns, but one of them MUST be
+`participant_id` and the entries of that column MUST correspond to the subjects
+in the BIDS dataset and `participants.tsv` file.
+
+!!! success "Guideline 2"
+
+    For [best tabular phenotypic data](../appendices/phenotype.md):
+    It is REQUIRED to aggregate all participant data into
+    one TSV per tabular phenotypic file.
+
+In phenotypic and assessment data each measurement tool has
+an independent aggregated data TSV file in which the user collects
+all subjects, sessions, and/or runs of data as one entry per row
+(with a row defined by the smallest unit of acquisition). In other words:
+
+1. Each row MUST start with `participant_id`.
+2. Each TSV file SHOULD contain a `session_id` column
+when multiple [sessions](../glossary.md#session-entities) are present
+in the data set regardless of whether those sessions are in
+the `phenotype/` data, `sub-<label>/` data, or a combination of the two.
+3. If more than one of the same measurement tool is acquired
+within the same `session_id`, a `run` column SHOULD be added.
+4. To encode the acquisition time for a measurement tool’s `session_id`,
+add the `session_id` to the sessions file
+and include the OPTIONAL `acq_time` column.
+
+!!! success "Guideline 3"
+
+    For [best tabular phenotypic data](../appendices/phenotype.md):
+ 
+    | **Column name**  | **Requirement** | **Description** |
+    | :--------------- | :-------------- | :-------------- |
+    | `participant_id` | REQUIRED        | MUST be the first column in the file. Note that data for one participant MAY be represented across multiple rows in case of multiple sessions or runs, and therefore the entry in the `participant_id` column will be repeated. |
+    | `session _id`    | CONDITIONAL ; If sessions are defined in the dataset | A `session_id` column MUST be added to all tabular files in the phenotype directory as soon as multiple sessions are present in the data set regardless of whether those sessions are in the  `phenotype/` data, `sub-<label>/` data, or a combination of the two. |
+    | `run`            | CONDITIONAL ; If there are multiple runs within any session | A chronological `run` number is used when a measurement tool or assessment described by a tabular file was repeated within a session. |
+    | `acq_time`       | OPTIONAL        | If acquisition time is available, the `acq_time` column CAN be used to record the time of acquisition of each row in the tabular file. |
+    
+    Furthermore, if you have to add a `session_id` column to the tabular phenotypic data, you then MUST also introduce a session directory to the imaging data, even if only one imaging session has been created. This rule can be considered as "**if anyone uses sessions, everyone uses sessions**." And vice versa, if imaging data has session directories, all imaging data and tabular phenotypic data MUST have sessions.
+
+    This produces a file in which same-participant entries can take up as many rows as needed according to the smallest unit of acquisition. The combination of values in the `participant_id`, `session_id`, and `run` (if present) columns MUST be unique for the entire tabular file.
+
+As with all other tabular data, the additional phenotypic information files
+MAY be accompanied by a JSON file describing the columns in detail
+(see [Tabular files](../common-principles.md#tabular-files)).
+
+!!! success "Guideline 1"
+
+    For [best tabular phenotypic data](../appendices/phenotype.md):
+    Each tabular phenotypic data TSV file MUST be accompanied by
+    a corresponding data dictionary JSON file.
+
+In addition to the column descriptions, the JSON file MAY contain the following fields:
+
+<!-- This block generates a metadata table.
+The definitions of these fields can be found in
+  src/schema/objects/metadata.yaml
+and a guide for using macros can be found at
+ https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
+-->
+{{ MACROS___make_metadata_table(
+   {
+      "MeasurementToolMetadata": "OPTIONAL",
+      "Derivative": "OPTIONAL",
+   }
+) }}
+
+As an example, consider the contents of a file called
+`phenotype/acds_adult.json`:
+
+```JSON
+{
+  "MeasurementToolMetadata": {
+    "Description": "Adult ADHD Clinical Diagnostic Scale V1.2",
+    "TermURL": "https://www.cognitiveatlas.org/task/id/trm_5586ff878155d"
+  },
+  "adhd_b": {
+    "Description": "B. CHILDHOOD ONSET OF ADHD (PRIOR TO AGE 7)",
+    "Levels": {
+      "1": "YES",
+      "2": "NO"
+    }
+  },
+  "adhd_c_dx": {
+    "Description": "As child met A, B, C, D, E and F diagnostic criteria",
+    "Levels": {
+      "1": "YES",
+      "2": "NO"
+    }
+  }
+}
+```
+
+Please note that in this example `MeasurementToolMetadata` includes information
+about the questionnaire and `adhd_b` and `adhd_c_dx` correspond to individual
+columns.
+
+!!! success "Guideline 4"
+
+    For [best tabular phenotypic data](../appendices/phenotype.md):
+    Whenever possible, it is RECOMMENDED to add `MeasurementToolMetadata` to
+    each `phenotype/<measurement_tool_name>.json` data dictionary.
+    This improves reusability and provides clarity about the measurement tool.
+
+
+In addition to the keys available to describe columns in all tabular files
+(`LongName`, `Description`, `Levels`, `Units`, and `TermURL`) the
+`participants.json` file as well as phenotypic files can also include column
+descriptions with a `Derivative` field that, when set to true, indicates that
+values in the corresponding column are a transformation of values from other
+columns (for example a summary score based on a subset of items in a
+questionnaire).
+
+## Demographics file
+
+Template:
+
+```Text
+phenotype/
+    demographics.tsv
+    demographics.json
+```
+
+The demographics file is an OPTIONAL tabular phenotypic file in
+the `phenotype/` directory meant to house common subject demographics,
+for example age, gender, race, and household income.
+Using a demographics file is RECOMMENDED when any participant has
+more than one session of any type.
+It does not replace the participants file, which is meant for unchanging data about
+each participant in the data set. It is instead a superset of the participants file,
+centralizing demographics across as many sessions as are available.
+
+!!! success "Guideline 5"
+
+    For [best tabular phenotypic data](../appendices/phenotype.md):
+    Some studies collect demographics into their own
+    tabular phenotypic data file already. In these cases, it is RECOMMENDED
+    to house this data also in the demographics file.
+
+!!! success "Guideline 6"
+
+    For [best tabular phenotypic data](../appendices/phenotype.md):
+    It is RECOMMENDED to use the `age` column to record participant age
+    at every session in longitudinal or multi-session data sets.
+    This reduces data duplication across tabular data files. The `Units` of `age`
+    do not have to be years so long as the units of the age
+    are written in `phenotype/demographics.json`.
+    Consider participant privacy or study objectives when selecting
+    the `Units` of `age` or the accuracy of `age` data.
+
 ## Scans file
 
 Template:
@@ -189,19 +360,20 @@ All such included additional fields SHOULD be documented in an accompanying
 
 Example `_scans.tsv`:
 
-```tsv
-filename	acq_time
-func/sub-control01_task-nback_bold.nii.gz	1877-06-15T13:45:30
-func/sub-control01_task-motor_bold.nii.gz	1877-06-15T13:55:33
-meg/sub-control01_task-rest_split-01_meg.nii.gz	1877-06-15T12:15:27
-meg/sub-control01_task-rest_split-02_meg.nii.gz	1877-06-15T12:15:27
+```Text
+filename                                        acq_time
+func/sub-control01_task-nback_bold.nii.gz       1877-06-15T13:45:30
+func/sub-control01_task-motor_bold.nii.gz       1877-06-15T13:55:33
+meg/sub-control01_task-rest_split-01_meg.nii.gz 1877-06-15T12:15:27
+meg/sub-control01_task-rest_split-02_meg.nii.gz 1877-06-15T12:15:27
 ```
 
 ## Sessions file
 
-Template:
+Template A (segregated sessions files):
 
 ```Text
+[sessions.json]
 sub-<label>/
     sub-<label>_sessions.tsv
 ```
@@ -209,11 +381,11 @@ sub-<label>/
 Optional: Yes
 
 In case of multiple sessions there is an option of adding additional
-`sessions.tsv` files describing variables changing between sessions.
+`sessions.tsv` files describing each session and variables changing between sessions.
 In such case one file per participant SHOULD be added.
 These files MUST include a `session_id` column and describe each session by one and only one row.
 Column names in `sessions.tsv` files MUST be different from group level participant key column names in the
-[`participants.tsv` file](./data-summary-files.md#participants-file).
+[`participants.tsv` file](#participants-file).
 
 <!-- This block generates a columns table.
 The definitions of these fields can be found in
@@ -223,11 +395,71 @@ and a guide for using macros can be found at
 -->
 {{ MACROS___make_columns_table("modality_agnostic.Sessions") }}
 
-`_sessions.tsv` example:
+`sub-<label>/sub-<label>_sessions.tsv` example:
 
-```tsv
-session_id	acq_time	systolic_blood_pressure
-ses-predrug	2009-06-15T13:45:30	120
-ses-postdrug	2009-06-16T13:45:30	100
-ses-followup	2009-06-17T13:45:30	110
+```Text
+session_id   acq_time            systolic_blood_pressure
+ses-predrug  2009-06-15T13:45:30 120
+ses-postdrug 2009-06-16T13:45:30 100
+ses-followup 2009-06-17T13:45:30 110
 ```
+
+Template B (aggregated sessions file):
+
+```Text
+sessions.tsv
+sessions.json
+```
+
+Optional: Yes
+
+An aggregated sessions file MAY be provided at the dataset root.
+If a root-level sessions file is provided, then it MUST begin with
+a `participant_id` column followed immediately by a `session_id` column.
+The intent of this root-level sessions file is to describe the sessions
+in a data set and non-demographic variables changing between sessions.
+Participants' demographic variables should be added to
+a [demographics file](#demographics-file), as described above.
+
+`sessions.tsv` example:
+
+```Text
+participant_id session_id   acq_time            systolic_blood_pressure
+sub-01         ses-predrug  2009-06-15T13:45:30 120
+sub-01         ses-postdrug 2009-06-16T13:45:30 100
+sub-01         ses-followup 2009-06-17T13:45:30 110
+sub-02         ses-predrug  2009-06-22T12:22:05 105
+sub-02         ses-postdrug 2009-06-23T12:22:05 95
+sub-03         ses-postdrug 2009-06-30T14:06:40 115
+sub-03         ses-followup 2009-07-01T14:06:40 120
+```
+
+!!! success "Guideline 7"
+
+    For [best tabular phenotypic data](../appendices/phenotype.md):
+    If there is more than one session for any one participant, then it is
+    REQUIRED to provide a sessions file at the dataset root.
+    The sessions file MUST list all sessions for all subjects
+    across imaging and tabular phenotypic data.
+
+    When a sessions file is in use, you MUST NOT provide additional sessions
+    files at the participant-level which would otherwise use
+    the inheritance principle. If a sessions file is provided, then
+    it MUST begin with a `participant_id` column followed immediately by
+    a `session_id` column. The data dictionary JSON file's `session_id` field
+    MUST include `Levels` with the description of each `session_id`.
+
+!!! success "Guideline 8"
+
+    For [best tabular phenotypic data](../appendices/phenotype.md):
+    Whenever possible, it is RECOMMENDED to also collect acquisition time
+    for tabular phenotypic data and store the time of acquisition of each row
+    inside a column named `acq_time` in the sessions file.
+    This is consistent with how acquisition time is recorded for MRI data
+    and other time-sensitive measurements (e.g. systolic blood pressure).
+
+    When it is needed to preserve participant privacy, you SHOULD record
+    relative acquisition times with respect to the earliest session.
+    Relative session acquisition times MAY be listed as durations from
+    the earliest session (baseline) in days, months, or years
+    using the `acq_time` column.
diff --git a/src/schema/objects/files.yaml b/src/schema/objects/files.yaml
index b16a8c9c1d..ff09767b45 100644
--- a/src/schema/objects/files.yaml
+++ b/src/schema/objects/files.yaml
@@ -68,12 +68,15 @@ participants:
   display_name: Participant Information
   file_type: regular
   description: |
-    The purpose of this RECOMMENDED file is to describe properties of participants
-    such as age, sex, handedness, species and strain.
+    The purpose of this RECOMMENDED file is to describe unchanging properties of participants
+    such as sex, species, and strain.
     If this file exists, it MUST contain the column `participant_id`,
     which MUST consist of `sub-<label>` values identifying one row for each participant,
     followed by a list of optional columns describing participants.
     Each participant MUST be described by one and only one row.
+    For participants with multiple sessions, see
+    the [sessions file](SPEC_ROOT/modality-agnostic-files/data-summary-files.md#sessions-file)
+    and [demographics file](SPEC_ROOT/modality-agnostic-files/data-summary-files.md#demographics-file) sections.
 
     The `participant_id` entries MUST be a superset of all subject directories
     and all `participant_id` entries found among phenotypic and assessment data

From 0a640e610392c50799848ee77b27551f8a50ee45 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Fri, 30 May 2025 06:55:48 -0700
Subject: [PATCH 02/53] Update phenotype.md and data-summary-files.md

Changed "e.g." to "for example" to follow contributing style guidelines.
---
 src/appendices/phenotype.md                       | 2 +-
 src/modality-agnostic-files/data-summary-files.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index 53afb47206..1d18991514 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -103,7 +103,7 @@ Whenever possible, it is RECOMMENDED to also collect acquisition time for
 tabular phenotypic data and store the time of acquisition[^2] of each row
 inside a column named `acq_time` in the sessions file.
 This is consistent with how acquisition time is recorded for MRI data
-and other time-sensitive measurements (e.g. systolic blood pressure).
+and other time-sensitive measurements (for example systolic blood pressure).
 
 When needed to preserve participant privacy, you SHOULD record
 relative acquisition times with respect to the earliest session.
diff --git a/src/modality-agnostic-files/data-summary-files.md b/src/modality-agnostic-files/data-summary-files.md
index fc02a5cbec..b14bd7ac47 100644
--- a/src/modality-agnostic-files/data-summary-files.md
+++ b/src/modality-agnostic-files/data-summary-files.md
@@ -456,7 +456,7 @@ sub-03         ses-followup 2009-07-01T14:06:40 120
     for tabular phenotypic data and store the time of acquisition of each row
     inside a column named `acq_time` in the sessions file.
     This is consistent with how acquisition time is recorded for MRI data
-    and other time-sensitive measurements (e.g. systolic blood pressure).
+    and other time-sensitive measurements (for example systolic blood pressure).
 
     When it is needed to preserve participant privacy, you SHOULD record
     relative acquisition times with respect to the earliest session.

From a19512bea49e6c2aa0ea94722701bac5def97350 Mon Sep 17 00:00:00 2001
From: "pre-commit-ci[bot]"
 <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Date: Fri, 30 May 2025 14:11:02 +0000
Subject: [PATCH 03/53] [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci
---
 src/modality-agnostic-files/data-summary-files.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/modality-agnostic-files/data-summary-files.md b/src/modality-agnostic-files/data-summary-files.md
index b14bd7ac47..6befc4bbe6 100644
--- a/src/modality-agnostic-files/data-summary-files.md
+++ b/src/modality-agnostic-files/data-summary-files.md
@@ -183,14 +183,14 @@ and include the OPTIONAL `acq_time` column.
 !!! success "Guideline 3"
 
     For [best tabular phenotypic data](../appendices/phenotype.md):
- 
+
     | **Column name**  | **Requirement** | **Description** |
     | :--------------- | :-------------- | :-------------- |
     | `participant_id` | REQUIRED        | MUST be the first column in the file. Note that data for one participant MAY be represented across multiple rows in case of multiple sessions or runs, and therefore the entry in the `participant_id` column will be repeated. |
     | `session _id`    | CONDITIONAL ; If sessions are defined in the dataset | A `session_id` column MUST be added to all tabular files in the phenotype directory as soon as multiple sessions are present in the data set regardless of whether those sessions are in the  `phenotype/` data, `sub-<label>/` data, or a combination of the two. |
     | `run`            | CONDITIONAL ; If there are multiple runs within any session | A chronological `run` number is used when a measurement tool or assessment described by a tabular file was repeated within a session. |
     | `acq_time`       | OPTIONAL        | If acquisition time is available, the `acq_time` column CAN be used to record the time of acquisition of each row in the tabular file. |
-    
+
     Furthermore, if you have to add a `session_id` column to the tabular phenotypic data, you then MUST also introduce a session directory to the imaging data, even if only one imaging session has been created. This rule can be considered as "**if anyone uses sessions, everyone uses sessions**." And vice versa, if imaging data has session directories, all imaging data and tabular phenotypic data MUST have sessions.
 
     This produces a file in which same-participant entries can take up as many rows as needed according to the smallest unit of acquisition. The combination of values in the `participant_id`, `session_id`, and `run` (if present) columns MUST be unique for the entire tabular file.

From 5718888b291e7451e636b801ce05b53a5d49fc80 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Fri, 30 May 2025 08:15:24 -0700
Subject: [PATCH 04/53] Update data-summary-files.md and
 phenotypic-and-assessment-data.md

Put the phenotypic and assessment data content where it belongs.
---
 .../data-summary-files.md                     | 134 ------------------
 .../phenotypic-and-assessment-data.md         |  55 ++++++-
 2 files changed, 53 insertions(+), 136 deletions(-)
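
Reviewer note (not part of the commit): the Guideline 3 text moved by this patch says `participant_id` MUST be the first column and that the combination of `participant_id`, `session_id`, and `run` (if present) MUST be unique within the file. A minimal sketch of how those two rules could be checked is given below. The function name `check_phenotype_rows` and the wording of its messages are hypothetical; this is not the BIDS validator's implementation.

```python
import csv
import io


def check_phenotype_rows(tsv_text):
    """Return a list of problems found in one aggregated phenotype TSV.

    Hypothetical helper: checks only the two Guideline 3 rules named above.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader, None)
    if not header:
        return ["file is empty"]
    if header[0] != "participant_id":
        return ["participant_id is not the first column"]

    # Only the key columns that are actually present take part in the check.
    key_columns = [c for c in ("participant_id", "session_id", "run") if c in header]
    key_indexes = [header.index(c) for c in key_columns]

    problems = []
    seen = set()
    for line_number, row in enumerate(reader, start=2):
        if not row:
            continue  # ignore blank lines
        if len(row) != len(header):
            problems.append(f"line {line_number}: expected {len(header)} cells, got {len(row)}")
            continue
        key = tuple(row[i] for i in key_indexes)
        if key in seen:
            problems.append(f"line {line_number}: duplicate key {key}")
        seen.add(key)
    return problems
```

Passing the text of a `phenotype/<measurement_tool_name>.tsv` file returns an empty list when both rules hold, and one message per offending line otherwise.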

diff --git a/src/modality-agnostic-files/data-summary-files.md b/src/modality-agnostic-files/data-summary-files.md
index 6befc4bbe6..d729529e45 100644
--- a/src/modality-agnostic-files/data-summary-files.md
+++ b/src/modality-agnostic-files/data-summary-files.md
@@ -132,140 +132,6 @@ It is RECOMMENDED to accompany each `samples.tsv` file with a sidecar
 }
 ```
 
-## Phenotypic and assessment data
-
-Template:
-
-```Text
-phenotype/
-    <measurement_tool_name>.tsv
-    <measurement_tool_name>.json
-```
-
-Optional: Yes
-
-If the dataset includes multiple sets of participant level measurements (for
-example responses from multiple questionnaires) they can be split into
-individual files separate from `participants.tsv`.
-
-Each of the measurement files MUST be kept in a `/phenotype` directory placed
-at the root of the BIDS dataset and MUST end with the `.tsv` extension.
-Filenames SHOULD be chosen to reflect the contents of the file.
-For example, the "Adult ADHD Clinical Diagnostic Scale" could be saved in a file
-called `phenotype/acds_adult.tsv`.
-
-The files can include an arbitrary set of columns, but one of them MUST be
-`participant_id` and the entries of that column MUST correspond to the subjects
-in the BIDS dataset and `participants.tsv` file.
-
-!!! success "Guideline 2"
-
-    For [best tabular phenotypic data](../appendices/phenotype.md):
-    It is REQUIRED to aggregate all participant data into
-    one TSV per tabular phenotypic file.
-
-In phenotypic and assessment data each measurement tool has
-an independent aggregated data TSV file in which the user collects
-all subjects, sessions, and/or runs of data as one entry per row
-(with a row defined by the smallest unit of acquisition). In other words:
-
-1. Each row MUST start with `participant_id`.
-2. Each TSV file SHOULD contain a `session_id` column
-when multiple [sessions](../glossary.md#session-entities) are present
-in the data set regardless of whether those sessions are in
-the `phenotype/` data, `sub-<label>/` data, or a combination of the two.
-3. If more than one of the same measurement tool is acquired
-within the same `session_id`, a `run` column SHOULD be added.
-4. To encode the acquisition time for a measurement tool’s `session_id`,
-add the `session_id` to the sessions file
-and include the OPTIONAL `acq_time` column.
-
-!!! success "Guideline 3"
-
-    For [best tabular phenotypic data](../appendices/phenotype.md):
-
-    | **Column name**  | **Requirement** | **Description** |
-    | :--------------- | :-------------- | :-------------- |
-    | `participant_id` | REQUIRED        | MUST be the first column in the file. Note that data for one participant MAY be represented across multiple rows in case of multiple sessions or runs, and therefore the entry in the `participant_id` column will be repeated. |
-    | `session _id`    | CONDITIONAL ; If sessions are defined in the dataset | A `session_id` column MUST be added to all tabular files in the phenotype directory as soon as multiple sessions are present in the data set regardless of whether those sessions are in the  `phenotype/` data, `sub-<label>/` data, or a combination of the two. |
-    | `run`            | CONDITIONAL ; If there are multiple runs within any session | A chronological `run` number is used when a measurement tool or assessment described by a tabular file was repeated within a session. |
-    | `acq_time`       | OPTIONAL        | If acquisition time is available, the `acq_time` column CAN be used to record the time of acquisition of each row in the tabular file. |
-
-    Furthermore, if you have to add a `session_id` column to the tabular phenotypic data, you then MUST also introduce a session directory to the imaging data, even if only one imaging session has been created. This rule can be considered as "**if anyone uses sessions, everyone uses sessions**." And vice versa, if imaging data has session directories, all imaging data and tabular phenotypic data MUST have sessions.
-
-    This produces a file in which same-participant entries can take up as many rows as needed according to the smallest unit of acquisition. The combination of values in the `participant_id`, `session_id`, and `run` (if present) columns MUST be unique for the entire tabular file.
-
-As with all other tabular data, the additional phenotypic information files
-MAY be accompanied by a JSON file describing the columns in detail
-(see [Tabular files](../common-principles.md#tabular-files)).
-
-!!! success "Guideline 1"
-
-    For [best tabular phenotypic data](../appendices/phenotype.md):
-    Each tabular phenotypic data TSV file MUST be accompanied by
-    a corresponding data dictionary JSON file.
-
-In addition to the column descriptions, the JSON file MAY contain the following fields:
-
-<!-- This block generates a metadata table.
-The definitions of these fields can be found in
-  src/schema/objects/metadata.yaml
-and a guide for using macros can be found at
- https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
--->
-{{ MACROS___make_metadata_table(
-   {
-      "MeasurementToolMetadata": "OPTIONAL",
-      "Derivative": "OPTIONAL",
-   }
-) }}
-
-As an example, consider the contents of a file called
-`phenotype/acds_adult.json`:
-
-```JSON
-{
-  "MeasurementToolMetadata": {
-    "Description": "Adult ADHD Clinical Diagnostic Scale V1.2",
-    "TermURL": "https://www.cognitiveatlas.org/task/id/trm_5586ff878155d"
-  },
-  "adhd_b": {
-    "Description": "B. CHILDHOOD ONSET OF ADHD (PRIOR TO AGE 7)",
-    "Levels": {
-      "1": "YES",
-      "2": "NO"
-    }
-  },
-  "adhd_c_dx": {
-    "Description": "As child met A, B, C, D, E and F diagnostic criteria",
-    "Levels": {
-      "1": "YES",
-      "2": "NO"
-    }
-  }
-}
-```
-
-Please note that in this example `MeasurementToolMetadata` includes information
-about the questionnaire and `adhd_b` and `adhd_c_dx` correspond to individual
-columns.
-
-!!! success "Guideline 4"
-
-    For [best tabular phenotypic data](../appendices/phenotype.md):
-    Whenever possible, it is RECOMMENDED to add `MeasurementToolMetadata` to
-    each `phenotype/<measurement_tool_name>.json` data dictionary.
-    This improves reusability and provides clarity about the measurement tool.
-
-
-In addition to the keys available to describe columns in all tabular files
-(`LongName`, `Description`, `Levels`, `Units`, and `TermURL`) the
-`participants.json` file as well as phenotypic files can also include column
-descriptions with a `Derivative` field that, when set to true, indicates that
-values in the corresponding column is a transformation of values from other
-columns (for example a summary score based on a subset of items in a
-questionnaire).
-
 ## Demographics file
 
 Template:
diff --git a/src/modality-agnostic-files/phenotypic-and-assessment-data.md b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
index d876a38ee7..fb03553d5a 100644
--- a/src/modality-agnostic-files/phenotypic-and-assessment-data.md
+++ b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
@@ -1,4 +1,4 @@
-# Phenotypic and assessment data
+## Phenotypic and assessment data
 
 Template:
 
@@ -18,16 +18,59 @@ Each of the measurement files MUST be kept in a `/phenotype` directory placed
 at the root of the BIDS dataset and MUST end with the `.tsv` extension.
 Filenames SHOULD be chosen to reflect the contents of the file.
 For example, the "Adult ADHD Clinical Diagnostic Scale" could be saved in a file
-called `/phenotype/acds_adult.tsv`.
+called `phenotype/acds_adult.tsv`.
 
 The files can include an arbitrary set of columns, but one of them MUST be
 `participant_id` and the entries of that column MUST correspond to the subjects
 in the BIDS dataset and `participants.tsv` file.
 
+!!! success "Guideline 2"
+
+    For [best tabular phenotypic data](../appendices/phenotype.md):
+    It is REQUIRED to aggregate all participant data into
+    one TSV per tabular phenotypic file.
+
+In phenotypic and assessment data each measurement tool has
+an independent aggregated data TSV file in which the user collects
+all subjects, sessions, and/or runs of data as one entry per row
+(with a row defined by the smallest unit of acquisition). In other words:
+
+1. Each row MUST start with `participant_id`.
+2. Each TSV file SHOULD contain a `session_id` column
+when multiple [sessions](../glossary.md#session-entities) are present
+in the data set regardless of whether those sessions are in
+the `phenotype/` data, `sub-<label>/` data, or a combination of the two.
+3. If more than one of the same measurement tool is acquired
+within the same `session_id`, a `run` column SHOULD be added.
+4. To encode the acquisition time for a measurement tool’s `session_id`,
+add the `session_id` to the sessions file
+and include the OPTIONAL `acq_time` column.
+
+!!! success "Guideline 3"
+
+    For [best tabular phenotypic data](../appendices/phenotype.md):
+
+    | **Column name**  | **Requirement** | **Description** |
+    | :--------------- | :-------------- | :-------------- |
+    | `participant_id` | REQUIRED        | MUST be the first column in the file. Note that data for one participant MAY be represented across multiple rows in case of multiple sessions or runs, and therefore the entry in the `participant_id` column will be repeated. |
+    | `session_id`     | CONDITIONAL ; If sessions are defined in the dataset | A `session_id` column MUST be added to all tabular files in the phenotype directory as soon as multiple sessions are present in the data set regardless of whether those sessions are in the `phenotype/` data, `sub-<label>/` data, or a combination of the two. |
+    | `run`            | CONDITIONAL ; If there are multiple runs within any session | A chronological `run` number is used when a measurement tool or assessment described by a tabular file was repeated within a session. |
+    | `acq_time`       | OPTIONAL        | If acquisition time is available, the `acq_time` column CAN be used to record the time of acquisition of each row in the tabular file. |
+
+    Furthermore, if you have to add a `session_id` column to the tabular phenotypic data, you then MUST also introduce a session directory to the imaging data, even if only one imaging session has been created. This rule can be considered as "**if anyone uses sessions, everyone uses sessions**." And vice versa, if imaging data has session directories, all imaging data and tabular phenotypic data MUST have sessions.
+
+    This produces a file in which same-participant entries can take up as many rows as needed according to the smallest unit of acquisition. The combination of values in the `participant_id`, `session_id`, and `run` (if present) columns MUST be unique for the entire tabular file.
+
 As with all other tabular data, the additional phenotypic information files
 MAY be accompanied by a JSON file describing the columns in detail
 (see [Tabular files](../common-principles.md#tabular-files)).
 
+!!! success "Guideline 1"
+
+    For [best tabular phenotypic data](../appendices/phenotype.md):
+    Each tabular phenotypic data TSV file MUST be accompanied by
+    a corresponding data dictionary JSON file.
+
 In addition to the column descriptions, the JSON file MAY contain the following fields:
 
 <!-- This block generates a metadata table.
@@ -73,6 +116,14 @@ Please note that in this example `MeasurementToolMetadata` includes information
 about the questionnaire and `adhd_b` and `adhd_c_dx` correspond to individual
 columns.
 
+!!! success "Guideline 4"
+
+    For [best tabular phenotypic data](../appendices/phenotype.md):
+    Whenever possible, it is RECOMMENDED to add `MeasurementToolMetadata` to
+    each `phenotype/<measurement_tool_name>.json` data dictionary.
+    This improves reusability and provides clarity about the measurement tool.
+
+
 In addition to the keys available to describe columns in all tabular files
 (`LongName`, `Description`, `Levels`, `Units`, and `TermURL`) the
 `participants.json` file as well as phenotypic files can also include column

From 8f54e94858462abf1c7778cec6333956840e7bf4 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Fri, 30 May 2025 08:39:14 -0700
Subject: [PATCH 05/53] Apply suggestions from code review

Fix `Text` examples to become `tsv` examples with correct tab delimiters.
---
 src/appendices/phenotype.md                   | 74 +++++++++----------
 .../data-summary-files.md                     | 64 ++++++++--------
 2 files changed, 69 insertions(+), 69 deletions(-)
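
Reviewer note (not part of the commit): the hunks below replace space-aligned example tables with real tab-delimited rows. A rough sketch of the same conversion is shown here, assuming no cell contains whitespace, which holds for every example touched by this patch. The function name `align_to_tsv` is hypothetical, and the sketch is not necessarily how the change was produced.

```python
def align_to_tsv(text):
    """Turn a space-aligned table into tab-separated rows.

    Assumption: cells never contain spaces, so any run of whitespace
    is a column boundary.
    """
    rows = []
    for line in text.splitlines():
        # str.split() with no argument collapses runs of whitespace.
        rows.append("\t".join(line.split()))
    return "\n".join(rows) + "\n"
```

A cell such as `n/a` survives unchanged, but a free-text cell with spaces would be split into several columns, so such tables would need manual conversion instead.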

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index 1d18991514..72a5b84eab 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -142,9 +142,9 @@ sub-01/anat/
 
 Contents of `phenotype/<measurement_tool_name>.tsv`
 
-```Text
-participant_id measurement_1 measurement_2
-sub-01 value1 value2
+```tsv
+participant_id	measurement_1	measurement_2
+sub-01	value1	value2
 ```
 
 ### 1 participant with 2 sessions, where 1 session is only tabular phenotype and the other is only imaging
@@ -172,9 +172,9 @@ sub-01/ses-MRI/anat/
 
 Contents of `phenotype/<measurement_tool_name>.tsv`
 
-```Text
-participant_id session_id measurement_1 measurement_2
-sub-01         ses-pheno  value1        value2
+```tsv
+participant_id	session_id	measurement_1	measurement_2
+sub-01	ses-pheno	value1	value2
 ```
 
 #### INCORRECT
@@ -192,9 +192,9 @@ sub-01/anat/
 
 Contents of `phenotype/<measurement_tool_name>.tsv`
 
-```Text
-participant_id measurement_1 measurement_2
-sub-01         value1        value2
+```tsv
+participant_id	measurement_1	measurement_2
+sub-01	value1	value2
 ```
 
 A session directory **MUST** be present in the participant directory and
@@ -228,11 +228,11 @@ sub-02/
 
 Contents of `phenotype/<measurement_tool_name>.tsv`
 
-```Text
-participant_id session_id measurement_1 measurement_2
-sub-01         ses-pheno1 value1        value2
-sub-02         ses-pheno1 value3        value4
-sub-02         ses-pheno2 value5        value6
+```tsv
+participant_id	session_id	measurement_1	measurement_2
+sub-01	ses-pheno1	value1	value2
+sub-02	ses-pheno1	value3	value4
+sub-02	ses-pheno2	value5	value6
 ```
 
 ### 3 participants with 3 different kinds of sessions among them
@@ -262,15 +262,15 @@ sub-03/
 
 Contents of `sessions.tsv`.
 
-```Text
-participant_id session_id      acq_time
-sub-01         ses-baseline    2001-01-01T12:05:00
-sub-01         ses-followupMRI 2001-07-01T13:33:00
-sub-01         ses-interview   2002-01-01T11:21:00
-sub-02         ses-baseline    2001-04-01T11:01:00
-sub-02         ses-interview   2002-04-01T14:08:00
-sub-03         ses-baseline    2001-09-01T11:45:00
-sub-03         ses-followupMRI 2002-03-01T12:17:00
+```tsv
+participant_id	session_id	acq_time
+sub-01	ses-baseline	2001-01-01T12:05:00
+sub-01	ses-followupMRI	2001-07-01T13:33:00
+sub-01	ses-interview	2002-01-01T11:21:00
+sub-02	ses-baseline	2001-04-01T11:01:00
+sub-02	ses-interview	2002-04-01T14:08:00
+sub-03	ses-baseline	2001-09-01T11:45:00
+sub-03	ses-followupMRI	2002-03-01T12:17:00
 ```
 
 Contents of `sessions.json`. Note how the `session_id` `Levels` are clearly described.
@@ -296,25 +296,25 @@ Contents of `sessions.json`. Note how the `session_id` `Levels` are clearly desc
 
 Contents of `participants.tsv`.
 
-```Text
-participant_id sex
-sub-01         M
-sub-02         F
-sub-03         F
+```tsv
+participant_id	sex
+sub-01	M
+sub-02	F
+sub-03	F
 ```
 
 Contents of `phenotype/demographics.tsv`. Measures or features that can change
 from session to session belong here especially.
 
-```Text
-participant_id session_id      age gender race household_income
-sub-01         ses-baseline    10  3      4    5
-sub-01         ses-followupMRI 10  3      4    5
-sub-01         ses-interview   11  4      4    6
-sub-02         ses-baseline    9   1      3    3
-sub-02         ses-interview   10  1      7    3
-sub-03         ses-baseline    11  2      10   4
-sub-03         ses-followupMRI 12  5      10   4
+```tsv
+participant_id	session_id	age	gender	race	household_income
+sub-01	ses-baseline	10	3	4	5
+sub-01	ses-followupMRI	10	3	4	5
+sub-01	ses-interview	11	4	4	6
+sub-02	ses-baseline	9	1	3	3
+sub-02	ses-interview	10	1	7	3
+sub-03	ses-baseline	11	2	10	4
+sub-03	ses-followupMRI	12	5	10	4
 ```
 
 For more complete examples, see the `pheno00*`
diff --git a/src/modality-agnostic-files/data-summary-files.md b/src/modality-agnostic-files/data-summary-files.md
index d729529e45..c2b8fe80b2 100644
--- a/src/modality-agnostic-files/data-summary-files.md
+++ b/src/modality-agnostic-files/data-summary-files.md
@@ -32,11 +32,11 @@ available").
 
 `participants.tsv` example:
 
-```Text
-participant_id age sex handedness group
-sub-01         34  M   right      read
-sub-02         12  F   right      write
-sub-03         33  F   n/a        read
+```tsv
+participant_id	age	sex	handedness	group
+sub-01	34	M	right	read
+sub-02	12	F	right	write
+sub-03	33	F	n/a	read
 ```
 
 It is RECOMMENDED to accompany each `participants.tsv` file with a sidecar
@@ -106,13 +106,13 @@ and a guide for using macros can be found at
 
 `samples.tsv` example:
 
-```Text
-sample_id participant_id sample_type derived_from
-sample-01 sub-01         tissue      n/a
-sample-02 sub-01         tissue      sample-01
-sample-03 sub-01         tissue      sample-01
-sample-04 sub-02         tissue      n/a
-sample-05 sub-02         tissue      n/a
+```tsv
+sample_id	participant_id	sample_type	derived_from
+sample-01	sub-01	tissue	n/a
+sample-02	sub-01	tissue	sample-01
+sample-03	sub-01	tissue	sample-01
+sample-04	sub-02	tissue	n/a
+sample-05	sub-02	tissue	n/a
 ```
 
 It is RECOMMENDED to accompany each `samples.tsv` file with a sidecar
@@ -226,12 +226,12 @@ All such included additional fields SHOULD be documented in an accompanying
 
 Example `_scans.tsv`:
 
-```Text
-filename                                        acq_time
-func/sub-control01_task-nback_bold.nii.gz       1877-06-15T13:45:30
-func/sub-control01_task-motor_bold.nii.gz       1877-06-15T13:55:33
-meg/sub-control01_task-rest_split-01_meg.nii.gz 1877-06-15T12:15:27
-meg/sub-control01_task-rest_split-02_meg.nii.gz 1877-06-15T12:15:27
+```tsv
+filename	acq_time
+func/sub-control01_task-nback_bold.nii.gz	1877-06-15T13:45:30
+func/sub-control01_task-motor_bold.nii.gz	1877-06-15T13:55:33
+meg/sub-control01_task-rest_split-01_meg.nii.gz	1877-06-15T12:15:27
+meg/sub-control01_task-rest_split-02_meg.nii.gz	1877-06-15T12:15:27
 ```
 
 ## Sessions file
@@ -263,11 +263,11 @@ and a guide for using macros can be found at
 
 `sub-<label>/sub-<label>_sessions.tsv` example:
 
-```Text
-session_id   acq_time            systolic_blood_pressure
-ses-predrug  2009-06-15T13:45:30 120
-ses-postdrug 2009-06-16T13:45:30 100
-ses-followup 2009-06-17T13:45:30 110
+```tsv
+session_id	acq_time	systolic_blood_pressure
+ses-predrug	2009-06-15T13:45:30	120
+ses-postdrug	2009-06-16T13:45:30	100
+ses-followup	2009-06-17T13:45:30	110
 ```
 
 Template B (aggregated sessions file):
@@ -289,15 +289,15 @@ a [demographics file](#demographics-file), as described above.
 
 `sessions.tsv` example:
 
-```Text
-participant_id session_id   acq_time            systolic_blood_pressure
-sub-01         ses-predrug  2009-06-15T13:45:30 120
-sub-01         ses-postdrug 2009-06-16T13:45:30 100
-sub-01         ses-followup 2009-06-17T13:45:30 110
-sub-02         ses-predrug  2009-06-22T12:22:05 105
-sub-02         ses-postdrug 2009-06-23T12:22:05 95
-sub-03         ses-postdrug 2009-06-30T14:06:40 115
-sub-03         ses-followup 2009-07-01T14:06:40 120
+```tsv
+participant_id	session_id	acq_time	systolic_blood_pressure
+sub-01	ses-predrug	2009-06-15T13:45:30	120
+sub-01	ses-postdrug	2009-06-16T13:45:30	100
+sub-01	ses-followup	2009-06-17T13:45:30	110
+sub-02	ses-predrug	2009-06-22T12:22:05	105
+sub-02	ses-postdrug	2009-06-23T12:22:05	95
+sub-03	ses-postdrug	2009-06-30T14:06:40	115
+sub-03	ses-followup	2009-07-01T14:06:40	120
 ```
 
 !!! success "Guideline 7"

From 94cb476450dae19bc049c339f0b00ee150d6903a Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Fri, 30 May 2025 08:41:04 -0700
Subject: [PATCH 06/53] Apply suggestions from code review

Correct minor typos in words, headers, and links.
---
 src/appendices/phenotype.md                                   | 2 +-
 src/modality-agnostic-files/phenotypic-and-assessment-data.md | 2 +-
 src/schema/objects/files.yaml                                 | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index 72a5b84eab..a755df56d9 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -61,7 +61,7 @@ as needed according to the smallest unit of acquisition.
 The combination of values in the `participant_id`, `session_id`, and `run` (if present)
 columns MUST be unique for the entire tabular file.
 
-### 4. Add `MeasurementToolMetadata` to each tabular phenotypic measurment tool
+### 4. Add `MeasurementToolMetadata` to each tabular phenotypic measurement tool
 
 Whenever possible, it is RECOMMENDED to add `MeasurementToolMetadata` to
 each `phenotype/<measurement_tool_name>.json` data dictionary.
diff --git a/src/modality-agnostic-files/phenotypic-and-assessment-data.md b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
index fb03553d5a..0f2362cf1f 100644
--- a/src/modality-agnostic-files/phenotypic-and-assessment-data.md
+++ b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
@@ -1,4 +1,4 @@
-## Phenotypic and assessment data
+# Phenotypic and assessment data
 
 Template:
 
diff --git a/src/schema/objects/files.yaml b/src/schema/objects/files.yaml
index ff09767b45..5d9b77a891 100644
--- a/src/schema/objects/files.yaml
+++ b/src/schema/objects/files.yaml
@@ -75,8 +75,8 @@ participants:
     followed by a list of optional columns describing participants.
     Each participant MUST be described by one and only one row.
     For participants with multiple sessions, see
-    the [sessions file](SPEC_ROOT/modality-agnostic-files.md#sessions-file)
-    and [demographics file](SPEC_ROOT/modality-agnostic-files.md#demographics-file) sections.
+    the [sessions file](SPEC_ROOT/modality-agnostic-files/data-summary-files.md#sessions-file)
+    and [demographics file](SPEC_ROOT/modality-agnostic-files/data-summary-files.md#demographics-file) sections.
 
     The `participant_id` entries MUST be a superset of all subject directories
     and all `participant_id` entries found among phenotypic and assessment data

From 142c460bda186494a377d1734337805923e4d5d8 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Fri, 30 May 2025 09:11:08 -0700
Subject: [PATCH 07/53] Apply suggestions from code review

Trying to satisfy `remark-lint`.
---
 src/appendices/phenotype.md                   | 23 +++++++++++--------
 .../phenotypic-and-assessment-data.md         | 11 +++++----
 2 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index a755df56d9..8aca9e4d26 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -29,16 +29,19 @@ aggregated data TSV file in which the user collects all subjects, sessions,
 and/or runs of data as one entry per row (with a row defined by
 the smallest unit of acquisition). In other words:
 
-1. Each row MUST start with `participant_id`.
-2. Each TSV file MUST contain a `session_id` column when
-multiple [sessions](../glossary.md#session-entities)[^1] are present
-in the data set regardless of whether those sessions are in
-the `phenotype/` data, `sub-<label>/` data, or a combination of the two.
-3. If more than one of the same measurement tool is acquired within
-the same `session_id`, a `run` column MUST be added.
-4. To encode the acquisition time for a measurement tool’s `session_id`,
-add the `session_id` to the sessions file and
-include the OPTIONAL `acq_time` column.
+1.  Each row MUST start with `participant_id`.
+
+1.  Each TSV file MUST contain a `session_id` column when
+  multiple [sessions](../glossary.md#session-entities)[^1] are present
+  in the data set regardless of whether those sessions are in
+  the `phenotype/` data, `sub-<label>/` data, or a combination of the two.
+
+1.  If more than one of the same measurement tool is acquired within
+  the same `session_id`, a `run` column MUST be added.
+
+1.  To encode the acquisition time for a measurement tool’s `session_id`,
+  add the `session_id` to the sessions file and
+  include the OPTIONAL `acq_time` column.
 
 To summarize this guideline as a table:
 
diff --git a/src/modality-agnostic-files/phenotypic-and-assessment-data.md b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
index 0f2362cf1f..622d4f746c 100644
--- a/src/modality-agnostic-files/phenotypic-and-assessment-data.md
+++ b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
@@ -35,14 +35,17 @@ an independent aggregated data TSV file in which the user collects
 all subjects, sessions, and/or runs of data as one entry per row
 (with a row defined by the smallest unit of acquisition). In other words:
 
-1. Each row MUST start with `participant_id`.
-2. Each TSV file SHOULD contain a `session_id` column
+1.  Each row MUST start with `participant_id`.
+
+1.  Each TSV file SHOULD contain a `session_id` column
 when multiple [sessions](../glossary.md#session-entities) are present
 in the data set regardless of whether those sessions are in
 the `phenotype/` data, `sub-<label>/` data, or a combination of the two.
-3. If more than one of the same measurement tool is acquired
+
+1.  If more than one of the same measurement tool is acquired
 within the same `session_id`, a `run` column SHOULD be added.
-4. To encode the acquisition time for a measurement tool’s `session_id`,
+
+1.  To encode the acquisition time for a measurement tool’s `session_id`,
 add the `session_id` to the sessions file
 and include the OPTIONAL `acq_time` column.
 

From 8b783598768c77db332d34b490900c8150b61974 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Fri, 30 May 2025 09:13:29 -0700
Subject: [PATCH 08/53] Update
 src/modality-agnostic-files/phenotypic-and-assessment-data.md

Trying to satisfy `remark-lint`.
---
 src/modality-agnostic-files/phenotypic-and-assessment-data.md | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/modality-agnostic-files/phenotypic-and-assessment-data.md b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
index 622d4f746c..0215226cfe 100644
--- a/src/modality-agnostic-files/phenotypic-and-assessment-data.md
+++ b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
@@ -125,8 +125,6 @@ columns.
     Whenever possible, it is RECOMMENDED to add `MeasurementToolMetadata` to
     each `phenotype/<measurement_tool_name>.json` data dictionary.
     This improves reusability and provides clarity about the measurement tool.
-
-
 In addition to the keys available to describe columns in all tabular files
 (`LongName`, `Description`, `Levels`, `Units`, and `TermURL`) the
 `participants.json` file as well as phenotypic files can also include column

From 60f712a92412fd7fd85202b5a66d0a5371421e6a Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Fri, 30 May 2025 12:49:02 -0700
Subject: [PATCH 09/53] Update mkdocs.yml

Added nav to BEP036 phenotype appendix.
---
 mkdocs.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mkdocs.yml b/mkdocs.yml
index 2e5c7b3576..c21722567c 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -46,6 +46,7 @@ nav:
           - Quantitative MRI: appendices/qmri.md
           - Arterial Spin Labeling: appendices/arterial-spin-labeling.md
           - Cross modality correspondence: appendices/cross-modality-correspondence.md
+          - Phenotypic data guidelines: appendices/phenotype.md
       - Changelog: CHANGES.md
   - The BIDS Starter Kit:
       - Website: https://bids-standard.github.io/bids-starter-kit/

From e62b5cc65c29cc53485efe068de59dc6410005b9 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Fri, 30 May 2025 16:55:17 -0700
Subject: [PATCH 10/53] Update
 src/modality-agnostic-files/phenotypic-and-assessment-data.md

Fixing a missed line break.
---
 src/modality-agnostic-files/phenotypic-and-assessment-data.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/modality-agnostic-files/phenotypic-and-assessment-data.md b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
index 0215226cfe..de3bf608db 100644
--- a/src/modality-agnostic-files/phenotypic-and-assessment-data.md
+++ b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
@@ -125,6 +125,7 @@ columns.
     Whenever possible, it is RECOMMENDED to add `MeasurementToolMetadata` to
     each `phenotype/<measurement_tool_name>.json` data dictionary.
     This improves reusability and provides clarity about the measurement tool.
+
 In addition to the keys available to describe columns in all tabular files
 (`LongName`, `Description`, `Levels`, `Units`, and `TermURL`) the
 `participants.json` file as well as phenotypic files can also include column

From ac097aa5347a1c11bbf45151aa5277be85ec4923 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Tue, 24 Jun 2025 14:46:28 -0400
Subject: [PATCH 11/53] Update phenotype.md to have a macro table from schema

This is my first attempt. Hopefully it works?
---
 src/appendices/phenotype.md                   | 13 +++----
 .../rules/tabular_data/modality_agnostic.yaml | 36 +++++++++++++++++++
 2 files changed, 43 insertions(+), 6 deletions(-)

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index 8aca9e4d26..92f7957ba9 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -45,12 +45,13 @@ the smallest unit of acquisition). In other words:
 
 To summarize this guideline as a table:
 
-| **Column name**  | **Requirement** | **Description** |
-| :--------------- | :-------------- | :-------------- |
-| `participant_id` | REQUIRED        | MUST be the first column in the file.   Note that data for one participant MAY be represented across multiple rows in case of multiple sessions or runs, and therefore the entry in the `participant_id` column will be repeated. |
-| `session _id`    | CONDITIONAL ; If sessions are defined in the dataset | A `session_id` column MUST be added to all tabular files in the phenotype directory as soon as multiple sessions are present in the data set regardless of whether those sessions are in the  `phenotype/` data, `sub-<label>/` data, or a combination of the two. |
-| `run`            | CONDITIONAL ; If there are multiple runs within any session | A chronological `run` number is used when a measurement tool or assessment described by a tabular file was repeated within a session. |
-| `acq_time`       | OPTIONAL        | If acquisition time is available, the `acq_time` column CAN be used to record the time of acquisition of each row in the tabular file. |
+<!-- This block generates a columns table.
+The definitions of these fields can be found in
+  src/schema/rules/tabular_data/*.yaml
+and a guide for using macros can be found at
+ https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
+-->
+{{ MACROS___make_columns_table("modality_agnostic.Phenotype") }}
 
 Furthermore, if you have to add a `session_id` column to the
 tabular phenotypic data, you then MUST also introduce a session directory to the
diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index fcaa4b32d4..b9cfdf108a 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -18,6 +18,42 @@ Participants:
   index_columns: [participant_id]
   additional_columns: allowed
 
+Phenotype:
+  selectors:
+    - extension == ".tsv"
+  initial_columns:
+    - participant_id
+  columns:
+    participant_id:
+      level: required
+      description_addendum: |
+        MUST be the first column in the file.
+        Note that data for one participant MAY be represented across multiple rows
+        in case of multiple sessions or runs, and
+        therefore the entry in the `participant_id` column will be repeated.
+    session_id:
+      level: optional
+      description_addendum: |
+        REQUIRED if sessions are defined in the dataset.
+        A `session_id` column MUST be added to all tabular files in the phenotype directory
+        as soon as multiple sessions are present in the data set
+        regardless of whether those sessions are in the
+        `phenotype/` data, `sub-<label>/` data, or a combination of the two.
+    run:
+      level: optional
+      description_addendum: |
+        REQUIRED if there are multiple runs within any session.
+        A chronological `run` number is used when
+        a measurement tool or assessment described by a tabular file
+        was repeated within a session.
+    acq_time__phenotype:
+      level: optional
+      description_addendum: |
+        If acquisition time is available, the `acq_time` column CAN be used
+        to record the time of acquisition of each row in the tabular file.
+  index_columns: [participant_id, session_id, run, acq_time__phenotype]
+  additional_columns: allowed
+
 Samples:
   selectors:
     - path == "/samples.tsv"

From 32fedd01dec4fd921c3c3628e4ba080821797ad6 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Tue, 24 Jun 2025 14:51:27 -0400
Subject: [PATCH 12/53] Update
 src/schema/rules/tabular_data/modality_agnostic.yaml

Attempt 2 to satisfy CircleCI.
---
 src/schema/rules/tabular_data/modality_agnostic.yaml | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index b9cfdf108a..0d061a137c 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -31,7 +31,7 @@ Phenotype:
         Note that data for one participant MAY be represented across multiple rows
         in case of multiple sessions or runs, and
         therefore the entry in the `participant_id` column will be repeated.
-    session_id:
+    session_id__phenotype:
       level: optional
       description_addendum: |
         REQUIRED if sessions are defined in the dataset.
@@ -39,7 +39,7 @@ Phenotype:
         as soon as multiple sessions are present in the data set
         regardless of whether those sessions are in the
         `phenotype/` data, `sub-<label>/` data, or a combination of the two.
-    run:
+    run__phenotype:
       level: optional
       description_addendum: |
         REQUIRED if there are multiple runs within any session.
@@ -51,7 +51,12 @@ Phenotype:
       description_addendum: |
         If acquisition time is available, the `acq_time` column CAN be used
         to record the time of acquisition of each row in the tabular file.
-  index_columns: [participant_id, session_id, run, acq_time__phenotype]
+  index_columns: [
+    participant_id,
+    session_id__phenotype,
+    run__phenotype,
+    acq_time__phenotype
+  ]
   additional_columns: allowed
 
 Samples:

From aacda9bdb5a07cd224248871f6b5f827aaabfe7e Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Tue, 24 Jun 2025 14:54:31 -0400
Subject: [PATCH 13/53] Update modality_agnostic.yaml

Attempt 3 to satisfy CircleCI.
---
 src/schema/rules/tabular_data/modality_agnostic.yaml | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index 0d061a137c..463f9d5850 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -51,12 +51,7 @@ Phenotype:
       description_addendum: |
         If acquisition time is available, the `acq_time` column CAN be used
         to record the time of acquisition of each row in the tabular file.
-  index_columns: [
-    participant_id,
-    session_id__phenotype,
-    run__phenotype,
-    acq_time__phenotype
-  ]
+  index_columns: [participant_id, session_id__phenotype, run__phenotype, acq_time__phenotype]
   additional_columns: allowed
 
 Samples:

From fd5ff2d5fe80b833937988441c85fc102e3581d6 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Tue, 24 Jun 2025 15:12:40 -0400
Subject: [PATCH 14/53] Update phenotype.md appendix and modality_agnostic.yaml
 schema

Attempt 4? to make macro table happy in the schema.
---
 src/appendices/phenotype.md                          | 2 +-
 src/schema/rules/tabular_data/modality_agnostic.yaml | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index 92f7957ba9..66a96cd6b4 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -51,7 +51,7 @@ The definitions of these fields can be found in
 and a guide for using macros can be found at
  https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
 -->
-{{ MACROS___make_columns_table("modality_agnostic.Phenotype") }}
+{{ MACROS___make_columns_table("modality_agnostic.Phenotypes") }}
 
 Furthermore, if you have to add a `session_id` column to the
 tabular phenotypic data, you then MUST also introduce a session directory to the
diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index 463f9d5850..f4fa6aeef9 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -18,7 +18,7 @@ Participants:
   index_columns: [participant_id]
   additional_columns: allowed
 
-Phenotype:
+Phenotypes:
   selectors:
     - extension == ".tsv"
   initial_columns:
@@ -31,7 +31,7 @@ Phenotype:
         Note that data for one participant MAY be represented across multiple rows
         in case of multiple sessions or runs, and
         therefore the entry in the `participant_id` column will be repeated.
-    session_id__phenotype:
+    session_id:
       level: optional
       description_addendum: |
         REQUIRED if sessions are defined in the dataset.
@@ -51,7 +51,7 @@ Phenotype:
       description_addendum: |
         If acquisition time is available, the `acq_time` column CAN be used
         to record the time of acquisition of each row in the tabular file.
-  index_columns: [participant_id, session_id__phenotype, run__phenotype, acq_time__phenotype]
+  index_columns: [participant_id, session_id, run__phenotype, acq_time__phenotype]
   additional_columns: allowed
 
 Samples:

From dd65b5ea66513e5be2f66e7d06c31c1932fd7182 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Tue, 24 Jun 2025 15:15:23 -0400
Subject: [PATCH 15/53] Update modality_agnostic.yaml

Attempt 5? to satisfy CircleCI, etc.
---
 src/schema/rules/tabular_data/modality_agnostic.yaml | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index f4fa6aeef9..12378ea271 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -39,19 +39,19 @@ Phenotypes:
         as soon as multiple sessions are present in the data set
         regardless of whether those sessions are in the
         `phenotype/` data, `sub-<label>/` data, or a combination of the two.
-    run__phenotype:
+    run:
       level: optional
       description_addendum: |
         REQUIRED if there are multiple runs within any session.
         A chronological `run` number is used when
         a measurement tool or assessment described by a tabular file
         was repeated within a session.
-    acq_time__phenotype:
+    acq_time:
       level: optional
       description_addendum: |
         If acquisition time is available, the `acq_time` column CAN be used
         to record the time of acquisition of each row in the tabular file.
-  index_columns: [participant_id, session_id, run__phenotype, acq_time__phenotype]
+  index_columns: [participant_id, session_id, run, acq_time]
   additional_columns: allowed
 
 Samples:

From f4205e8291bcd5aa30ad4ca66beee73774396a52 Mon Sep 17 00:00:00 2001
From: Ross Blair <rosswilsonblair@gmail.com>
Date: Thu, 17 Jul 2025 11:31:21 -0500
Subject: [PATCH 16/53] add missing column objects, use existing acq column
 definition (#4)

Thank you, Ross!
---
 src/schema/objects/columns.yaml                      | 7 +++++++
 src/schema/rules/tabular_data/modality_agnostic.yaml | 4 ++--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/schema/objects/columns.yaml b/src/schema/objects/columns.yaml
index f98b2e539d..f4728e724a 100644
--- a/src/schema/objects/columns.yaml
+++ b/src/schema/objects/columns.yaml
@@ -429,6 +429,13 @@ response_time:
     `n/a` denotes a missed response.
   type: number
   unit: s
+run:
+  name: run
+  display_name: Run
+  description: |
+    A run index that corresponds to an existing `run-<index>` entity used in in a filename(s).
+  type: string
+  format: index
 sample_id:
   name: sample_id
   display_name: Sample ID
diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index 12378ea271..f3cfce7ec3 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -46,12 +46,12 @@ Phenotypes:
         A chronological `run` number is used when
         a measurement tool or assessment described by a tabular file
         was repeated within a session.
-    acq_time:
+    acq_time__scans:
       level: optional
       description_addendum: |
         If acquisition time is available, the `acq_time` column CAN be used
         to record the time of acquisition of each row in the tabular file.
-  index_columns: [participant_id, session_id, run, acq_time]
+  index_columns: [participant_id, session_id, run, acq_time__scans]
   additional_columns: allowed
 
 Samples:

From 0eba71d0ab55d73ac32f88abad37bf73bdbbb5fb Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Thu, 17 Jul 2025 09:50:05 -0700
Subject: [PATCH 17/53] Updates for BEP036

- Updating run to run_id in phenotype appendix
- Updating the table to a MACRO in the phenotype agnostic files section
- Updating run to run_id in columns.yaml
- Updating run to run_id in modality_agnostic.yaml
---
 .vscode/settings.json                                 |  5 +++++
 src/appendices/phenotype.md                           |  4 ++--
 .../phenotypic-and-assessment-data.md                 | 11 +++--------
 src/schema/objects/columns.yaml                       | 10 +++++-----
 src/schema/rules/tabular_data/modality_agnostic.yaml  |  4 ++--
 5 files changed, 17 insertions(+), 17 deletions(-)
 create mode 100644 .vscode/settings.json
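
The row-level rules this patch touches (`participant_id` as the first column, `run_id` values matching `^run-[0-9]+$`, and a unique combination of `participant_id`, `session_id`, and `run_id`) could be checked roughly as follows. This is a minimal sketch, not the validator's implementation: the function name `check_phenotype_rows` is hypothetical, and the handling of `n/a` values is left out.

```python
import csv
import io
import re

# Pattern copied from the `run_id` column object in columns.yaml (this patch).
RUN_ID_PATTERN = re.compile(r"^run-[0-9]+$")


def check_phenotype_rows(tsv_text):
    """Return a list of problems found in a phenotype TSV passed as text.

    Hypothetical helper for illustration; it is not part of any BIDS tool.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    fields = reader.fieldnames or []
    problems = []

    # Rule: participant_id MUST be the first column.
    if not fields or fields[0] != "participant_id":
        problems.append("participant_id must be the first column")

    # Index columns that are actually present in this file.
    key_columns = [
        name for name in ("participant_id", "session_id", "run_id")
        if name in fields
    ]

    seen = set()
    # The header is line 1, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        run_id = row.get("run_id")
        if run_id is not None and not RUN_ID_PATTERN.fullmatch(run_id):
            problems.append(f"line {line_no}: invalid run_id {run_id!r}")

        # Rule: the combination of index values MUST be unique per file.
        key = tuple(row[name] for name in key_columns)
        if key in seen:
            problems.append(f"line {line_no}: duplicate index {key}")
        seen.add(key)

    return problems
```

Passing the TSV as text keeps the sketch free of file-system assumptions; a caller would read `phenotype/<measurement_tool_name>.tsv` and hand over its contents.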

diff --git a/.vscode/settings.json b/.vscode/settings.json
new file mode 100644
index 0000000000..058954976e
--- /dev/null
+++ b/.vscode/settings.json
@@ -0,0 +1,5 @@
+{
+    "githubPullRequests.ignoredPullRequestBranches": [
+        "master"
+    ]
+}
\ No newline at end of file
diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index 66a96cd6b4..0d8bbbf826 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -37,7 +37,7 @@ the smallest unit of acquisition). In other words:
   the `phenotype/` data, `sub-<label>/` data, or a combination of the two.
 
 1.  If more than one of the same measurement tool is acquired within
-  the same `session_id`, a `run` column MUST be added.
+  the same `session_id`, a `run_id` column MUST be added.
 
 1.  To encode the acquisition time for a measurement tool’s `session_id`,
   add the `session_id` to the sessions file and
@@ -62,7 +62,7 @@ all imaging data and tabular phenotypic data MUST have sessions.
 
 This produces a file in which same-participant entries can take up as many rows
 as needed according to the smallest unit of acquisition.
-The combination of values in the `participant_id`, `session_id`, and `run` (if present)
+The combination of values in the `participant_id`, `session_id`, and `run_id` (if present)
 columns MUST be unique for the entire tabular file.
 
 ### 4. Add `MeasurementToolMetadata` to each tabular phenotypic measurement tool
diff --git a/src/modality-agnostic-files/phenotypic-and-assessment-data.md b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
index de3bf608db..1be78e0cf2 100644
--- a/src/modality-agnostic-files/phenotypic-and-assessment-data.md
+++ b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
@@ -43,7 +43,7 @@ in the data set regardless of whether those sessions are in
 the `phenotype/` data, `sub-<label>/` data, or a combination of the two.
 
 1.  If more than one of the same measurement tool is acquired
-within the same `session_id`, a `run` column SHOULD be added.
+within the same `session_id`, a `run_id` column SHOULD be added.
 
 1.  To encode the acquisition time for a measurement tool’s `session_id`,
 add the `session_id` to the sessions file
@@ -53,16 +53,11 @@ and include the OPTIONAL `acq_time` column.
 
     For [best tabular phenotypic data](../appendices/phenotype.md):
 
-    | **Column name**  | **Requirement** | **Description** |
-    | :--------------- | :-------------- | :-------------- |
-    | `participant_id` | REQUIRED        | MUST be the first column in the file. Note that data for one participant MAY be represented across multiple rows in case of multiple sessions or runs, and therefore the entry in the `participant_id` column will be repeated. |
-    | `session _id`    | CONDITIONAL ; If sessions are defined in the dataset | A `session_id` column MUST be added to all tabular files in the phenotype directory as soon as multiple sessions are present in the data set regardless of whether those sessions are in the  `phenotype/` data, `sub-<label>/` data, or a combination of the two. |
-    | `run`            | CONDITIONAL ; If there are multiple runs within any session | A chronological `run` number is used when a measurement tool or assessment described by a tabular file was repeated within a session. |
-    | `acq_time`       | OPTIONAL        | If acquisition time is available, the `acq_time` column CAN be used to record the time of acquisition of each row in the tabular file. |
+    {{ MACROS___make_columns_table("modality_agnostic.Phenotypes") }}
 
     Furthermore, if you have to add a `session_id` column to the tabular phenotypic data, you then MUST also introduce a session directory to the imaging data, even if only one imaging session has been created. This rule can be considered as "**if anyone uses sessions, everyone uses sessions**." And vice versa, if imaging data has session directories, all imaging data and tabular phenotypic data MUST have sessions.
 
-    This produces a file in which same-participant entries can take up as many rows as needed according to the smallest unit of acquisition. The combination of values in the `participant_id`, `session_id`, and `run` (if present) columns MUST be unique for the entire tabular file.
+    This produces a file in which same-participant entries can take up as many rows as needed according to the smallest unit of acquisition. The combination of values in the `participant_id`, `session_id`, and `run_id` (if present) columns MUST be unique for the entire tabular file.
 
 As with all other tabular data, the additional phenotypic information files
 MAY be accompanied by a JSON file describing the columns in detail
diff --git a/src/schema/objects/columns.yaml b/src/schema/objects/columns.yaml
index f4728e724a..8c0cbe6c2c 100644
--- a/src/schema/objects/columns.yaml
+++ b/src/schema/objects/columns.yaml
@@ -429,13 +429,13 @@ response_time:
     `n/a` denotes a missed response.
   type: number
   unit: s
-run:
-  name: run
-  display_name: Run
+run_id:
+  name: run_id
+  display_name: Run ID
   description: |
-    A run index that corresponds to an existing `run-<index>` entity used in in a filename(s).
+    A run identifier that corresponds to an existing `run-<index>` entity used in a filename(s).
   type: string
-  format: index
+  pattern: ^run-[0-9]+$
 sample_id:
   name: sample_id
   display_name: Sample ID
diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index f3cfce7ec3..de39e4ceaf 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -39,7 +39,7 @@ Phenotypes:
         as soon as multiple sessions are present in the data set
         regardless of whether those sessions are in the
         `phenotype/` data, `sub-<label>/` data, or a combination of the two.
-    run:
+    run_id:
       level: optional
       description_addendum: |
         REQUIRED if there are multiple runs within any session.
@@ -51,7 +51,7 @@ Phenotypes:
       description_addendum: |
         If acquisition time is available, the `acq_time` column CAN be used
         to record the time of acquisition of each row in the tabular file.
-  index_columns: [participant_id, session_id, run, acq_time__scans]
+  index_columns: [participant_id, session_id, run_id, acq_time__scans]
   additional_columns: allowed
 
 Samples:

From d3631a8bb6f89c1a97082a93c434143771353dff Mon Sep 17 00:00:00 2001
From: "pre-commit-ci[bot]"
 <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Date: Thu, 17 Jul 2025 16:50:30 +0000
Subject: [PATCH 18/53] [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci
---
 .vscode/settings.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.vscode/settings.json b/.vscode/settings.json
index 058954976e..3a704128f1 100644
--- a/.vscode/settings.json
+++ b/.vscode/settings.json
@@ -2,4 +2,4 @@
     "githubPullRequests.ignoredPullRequestBranches": [
         "master"
     ]
-}
\ No newline at end of file
+}

From f4939ad7b2309884f379806f637416c05504ee6c Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Thu, 17 Jul 2025 10:11:59 -0700
Subject: [PATCH 19/53] Updates phenotype.md and Guideline 3 in the modality
 agnostic section

- Added a heading 4 to the appendix
- Removed the table and instead anchored a link to the appendix table
---
 src/appendices/phenotype.md                   |  2 +-
 .../phenotypic-and-assessment-data.md         | 35 ++++++++++++++++---
 2 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index 0d8bbbf826..dc3d678391 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -43,7 +43,7 @@ the smallest unit of acquisition). In other words:
   add the `session_id` to the sessions file and
   include the OPTIONAL `acq_time` column.
 
-To summarize this guideline as a table:
+#### To summarize this guideline as a table
 
 <!-- This block generates a columns table.
 The definitions of these fields can be found in
diff --git a/src/modality-agnostic-files/phenotypic-and-assessment-data.md b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
index 1be78e0cf2..9af867b4fa 100644
--- a/src/modality-agnostic-files/phenotypic-and-assessment-data.md
+++ b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
@@ -53,11 +53,36 @@ and include the OPTIONAL `acq_time` column.
 
     For [best tabular phenotypic data](../appendices/phenotype.md):
 
-    {{ MACROS___make_columns_table("modality_agnostic.Phenotypes") }}
-
-    Furthermore, if you have to add a `session_id` column to the tabular phenotypic data, you then MUST also introduce a session directory to the imaging data, even if only one imaging session has been created. This rule can be considered as "**if anyone uses sessions, everyone uses sessions**." And vice versa, if imaging data has session directories, all imaging data and tabular phenotypic data MUST have sessions.
-
-    This produces a file in which same-participant entries can take up as many rows as needed according to the smallest unit of acquisition. The combination of values in the `participant_id`, `session_id`, and `run_id` (if present) columns MUST be unique for the entire tabular file.
+    Each measurement tool SHOULD have an independent
+    aggregated data TSV file in which the user collects all subjects, sessions,
+    and/or runs of data as one entry per row (with a row defined by
+    the smallest unit of acquisition). In other words:
+
+    1.  Each row MUST start with `participant_id`.
+    1.  Each TSV file MUST contain a `session_id` column when
+      multiple [sessions](../glossary.md#session-entities) are present
+      in the data set regardless of whether those sessions are in
+      the `phenotype/` data, `sub-<label>/` data, or a combination of the two.
+    1.  If more than one of the same measurement tool is acquired within
+      the same `session_id`, a `run_id` column MUST be added.
+    1.  To encode the acquisition time for a measurement tool’s `session_id`,
+      add the `session_id` to the sessions file and
+      include the OPTIONAL `acq_time` column.
+
+    To see this guideline summarized as a table,
+    see [the appendix](../appendices/phenotype.md#to-summarize-this-guideline-as-a-table).
+
+    Furthermore, if you have to add a `session_id` column to the tabular phenotypic data,
+    you then MUST also introduce a session directory to the imaging data,
+    even if only one imaging session has been created.
+    This rule can be considered as "**if anyone uses sessions, everyone uses sessions**."
+    And vice versa, if imaging data has session directories,
+    all imaging data and tabular phenotypic data MUST have sessions.
+
+    This produces a file in which same-participant entries can take up as many rows as needed
+    according to the smallest unit of acquisition.
+    The combination of values in the `participant_id`, `session_id`, and `run_id` (if present)
+    columns MUST be unique for the entire tabular file.
 
 As with all other tabular data, the additional phenotypic information files
 MAY be accompanied by a JSON file describing the columns in detail

From abd5c2bd679da721d4dc23830c0f335264a88951 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Thu, 17 Jul 2025 10:28:52 -0700
Subject: [PATCH 20/53] Update modality_agnostic.yaml

- Change description_addendum to just description for Phenotypes table
- Add level_addendum for session_id and run_id
---
 src/schema/rules/tabular_data/modality_agnostic.yaml | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index de39e4ceaf..fe85fee5aa 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -26,14 +26,15 @@ Phenotypes:
   columns:
     participant_id:
       level: required
-      description_addendum: |
+      description: |
         MUST be the first column in the file.
         Note that data for one participant MAY be represented across multiple rows
         in case of multiple sessions or runs, and
         therefore the entry in the `participant_id` column will be repeated.
     session_id:
       level: optional
-      description_addendum: |
+      level_addendum: required if sessions are defined in the dataset
+      description: |
         REQUIRED if sessions are defined in the dataset.
         A `session_id` column MUST be added to all tabular files in the phenotype directory
         as soon as multiple sessions are present in the data set
@@ -41,14 +42,15 @@ Phenotypes:
         `phenotype/` data, `sub-<label>/` data, or a combination of the two.
     run_id:
       level: optional
-      description_addendum: |
+      level_addendum: required if there are multiple runs within any session
+      description: |
         REQUIRED if there are multiple runs within any session.
         A chronological `run` number is used when
         a measurement tool or assessment described by a tabular file
         was repeated within a session.
     acq_time__scans:
       level: optional
-      description_addendum: |
+      description: |
         If acquisition time is available, the `acq_time` column CAN be used
         to record the time of acquisition of each row in the tabular file.
   index_columns: [participant_id, session_id, run_id, acq_time__scans]

From ec2c53dba780b029ba0c8903e1361abcad01290c Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Thu, 17 Jul 2025 10:39:02 -0700
Subject: [PATCH 21/53] Update phenotypic-and-assessment-data.md

Removed a duplicated section between Guidelines 2 and 3
---
 .../phenotypic-and-assessment-data.md         | 19 -------------------
 1 file changed, 19 deletions(-)

diff --git a/src/modality-agnostic-files/phenotypic-and-assessment-data.md b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
index 9af867b4fa..d460b77975 100644
--- a/src/modality-agnostic-files/phenotypic-and-assessment-data.md
+++ b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
@@ -30,25 +30,6 @@ in the BIDS dataset and `participants.tsv` file.
     It is REQUIRED to aggregate all participant data into
     one TSV per tabular phenotypic file.
 
-In phenotypic and assessment data each measurement tool has
-an independent aggregated data TSV file in which the user collects
-all subjects, sessions, and/or runs of data as one entry per row
-(with a row defined by the smallest unit of acquisition). In other words:
-
-1.  Each row MUST start with `participant_id`.
-
-1.  Each TSV file SHOULD contain a `session_id` column
-when multiple [sessions](../glossary.md#session-entities) are present
-in the data set regardless of whether those sessions are in
-the `phenotype/` data, `sub-<label>/` data, or a combination of the two.
-
-1.  If more than one of the same measurement tool is acquired
-within the same `session_id`, a `run_id` column SHOULD be added.
-
-1.  To encode the acquisition time for a measurement tool’s `session_id`,
-add the `session_id` to the sessions file
-and include the OPTIONAL `acq_time` column.
-
 !!! success "Guideline 3"
 
     For [best tabular phenotypic data](../appendices/phenotype.md):

From d1141a06761e82eff59864af888fbe3582bb89fd Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Wed, 17 Sep 2025 05:00:03 -0700
Subject: [PATCH 22/53] Updates to remove demographics file and add
 AdditionalValidation field

Checkpoint in spec text supporting phenotype AdditionalValidation prior to updating pheno examples.
---
 src/appendices/phenotype.md                   |  77 ++++++------
 src/common-principles.md                      |   8 +-
 .../data-summary-files.md                     | 107 ++++++++---------
 .../dataset-description.md                    |  14 +++
 .../phenotypic-and-assessment-data.md         | 110 +++++++++---------
 src/schema/objects/files.yaml                 |  14 ++-
 src/schema/objects/metadata.yaml              |  13 +++
 src/schema/rules/dataset_metadata.yaml        |   1 +
 .../rules/tabular_data/modality_agnostic.yaml |   3 +-
 9 files changed, 188 insertions(+), 159 deletions(-)
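
With this change the guidelines apply only when `AdditionalValidation` in `dataset_description.json` contains `"Phenotype"`. A tool might decide whether to run the phenotype checks along these lines. This is a sketch under assumptions: the function name `phenotype_validation_enabled` is hypothetical, and the value is assumed to be a list of strings (a bare string is tolerated here for convenience only).

```python
import json


def phenotype_validation_enabled(dataset_description_text):
    """Return True if the dataset opts in to the phenotype guidelines.

    Hypothetical helper; assumes `AdditionalValidation` is a list of strings.
    """
    description = json.loads(dataset_description_text)
    requested = description.get("AdditionalValidation", [])
    # Assumption: tolerate a single string as shorthand for a one-item list.
    if isinstance(requested, str):
        requested = [requested]
    return "Phenotype" in requested
```

A dataset without the key is treated as not opting in, which matches the wording that the guidelines apply only when the key contains `"Phenotype"`.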

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index dc3d678391..ab05cdbcbd 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -1,30 +1,37 @@
 # Tabular phenotypic data guidelines
 
-This appendix is a collection of guidelines and examples for creating well-organized aggregated tabular phenotypic data.
+This appendix is a collection of guidelines and examples
+for creating well-organized aggregated tabular phenotypic data.
 
 ## Guidelines
 
-These guidelines are all **RECOMMENDED** when preparing
-tabular phenotypic data like the
-participants file, sessions file, demographics file,
-or phenotypic and assessment data.
-The language below uses REQUIRED, MUST, and others to imply
-these are the requirements for these **RECOMMENDED** guidelines.
+These guidelines all apply when the
+[`AdditionalValidation` key](dataset-description.md#additional-validation)
+contains `"Phenotype"` in the `dataset_description.json`.
+They are intended to improve the organization and clarity of
+tabular phenotypic data like the participants file, sessions file,
+and phenotypic and assessment data.
 
-### 1. Always pair tabular data with data dictionaries
+### 1. Aggregate data across sessions
+
+Aggregation refers to the contents of the TSV file. It is REQUIRED
+to collect all participant data into one TSV per tabular phenotypic file.
+
+### 2. Always pair tabular data with data dictionaries
 
 Tabular phenotypic data MUST be prepared as one pair of a tabular file
 in tab-separated value (TSV) format and a corresponding data dictionary
 in JavaScript Object Notation (JSON) format.
 
-### 2. Aggregate data across sessions
+### 3. Add `MeasurementToolMetadata` to each tabular phenotypic measurement tool
 
-Aggregation refers to the contents of the TSV file. It is REQUIRED
-to collect all participant data into one TSV per tabular phenotypic file.
+Whenever possible, it is RECOMMENDED to add `MeasurementToolMetadata` to
+each `phenotype/<measurement_tool_name>.json` data dictionary.
+This improves reusability and provides clarity about the measurement tool.
 
-### 3. Ensure minimal annotation for phenotypic and assessment data
+### 4. Ensure minimal annotation for phenotypic and assessment data
 
-In phenotypic and assessment data each measurement tool has an independent
+In phenotypic and assessment data each measurement tool SHOULD have an independent
 aggregated data TSV file in which the user collects all subjects, sessions,
 and/or runs of data as one entry per row (with a row defined by
 the smallest unit of acquisition). In other words:
@@ -32,16 +39,16 @@ the smallest unit of acquisition). In other words:
 1.  Each row MUST start with `participant_id`.
 
 1.  Each TSV file MUST contain a `session_id` column when
-  multiple [sessions](../glossary.md#session-entities)[^1] are present
-  in the data set regardless of whether those sessions are in
-  the `phenotype/` data, `sub-<label>/` data, or a combination of the two.
+    multiple [sessions](../glossary.md#session-entities)[^1] are present
+    in the data set regardless of whether those sessions are in
+    the `phenotype/` data, `sub-<label>/` data, or a combination of the two.
 
 1.  If more than one of the same measurement tool is acquired within
-  the same `session_id`, a `run_id` column MUST be added.
+    the same `session_id`, a `run_id` column MUST be added.
 
 1.  To encode the acquisition time for a measurement tool’s `session_id`,
-  add the `session_id` to the sessions file and
-  include the OPTIONAL `acq_time` column.
+    add the `session_id` to the sessions file and
+    include the OPTIONAL `acq_time` column.
 
 #### To summarize this guideline as a table
 
@@ -65,42 +72,31 @@ as needed according to the smallest unit of acquisition.
 The combination of values in the `participant_id`, `session_id`, and `run_id` (if present)
 columns MUST be unique for the entire tabular file.
 
-### 4. Add `MeasurementToolMetadata` to each tabular phenotypic measurement tool
-
-Whenever possible, it is RECOMMENDED to add `MeasurementToolMetadata` to
-each `phenotype/<measurement_tool_name>.json` data dictionary.
-This improves reusability and provides clarity about the measurement tool.
-
-### 5. Use the demographics file for common variables about participants
-
-Some studies collect demographics into their own tabular phenotypic data file already.
-In these cases, it is RECOMMENDED to house this data in the `phenotype/` directory
-as a TSV called `demographics.tsv` and its corresponding data dictionary JSON
-called `demographics.json`.
-
-### 6. Store longitudinal age in the demographics file
+### 5. Store longitudinal age in the participants file
 
 It is RECOMMENDED to use the `age` column to record participant age
 at every session in longitudinal or multi-session data sets.
 This reduces data duplication across tabular data files. The `Units` of `age`
 do not have to be years so long as the units of the age
-are written in `phenotype/demographics.json`.
+are written in `participants.json`.
 Consider participant privacy or study objectives when selecting
 the `Units` of `age` or the accuracy of `age` data.
 
-### 7. Use the sessions file at the root level
+### 6. Use the sessions file at the root level
 
 If there is more than one session for any one participant, then
 it is REQUIRED to provide a sessions file at the dataset root.
 The sessions file MUST list all sessions for all subjects across
 imaging and tabular phenotypic data.
-
-When a sessions file is in use, you MUST NOT provide additional sessions files
-at the participant-level which would otherwise use the inheritance principle.
 If a sessions file is provided, then it MUST begin with a `participant_id` column
 followed immediately by a `session_id` column. The data dictionary JSON file’s
 `session_id` field MUST include `Levels` with the description of each `session_id`.
 
+### 7. Use either a root-level sessions file or participant-level sessions files
+
+When a sessions file is in use, you MUST NOT provide additional sessions files
+at the participant level, which would otherwise use the inheritance principle.
+
 ### 8. Record acquisition time of sessions with `acq_time`
 
 Whenever possible, it is RECOMMENDED to also collect acquisition time for
@@ -109,6 +105,8 @@ inside a column named `acq_time` in the sessions file.
 This is consistent with how acquisition time is recorded for MRI data
 and other time-sensitive measurements (for example systolic blood pressure).
 
+### 9. Respect participant privacy when recording acquisition times
+
 When needed to preserve participant privacy, you SHOULD record
 relative acquisition times with respect to the earliest session.
 Relative session acquisition times MAY be listed as durations from
@@ -117,7 +115,7 @@ using the `acq_time` column.
 
 ## Summary
 
-This appendix described seven guidelines for best tabular phenotypic data.
+This appendix described guidelines for best tabular phenotypic data.
 A short summary table here describes when to use which files.
 
 | File                           | Single session data | Multiple session data |
@@ -125,7 +123,6 @@ A short summary table here describes when to use which files.
 | Participants                   | RECOMMENDED         | RECOMMENDED           |
 | Phenotypic and assessment data | RECOMMENDED         | RECOMMENDED           |
 | Sessions                       | OPTIONAL            | REQUIRED              |
-| Demographics                   | OPTIONAL            | RECOMMENDED           |
 
 ## Examples
 
diff --git a/src/common-principles.md b/src/common-principles.md
index fb0812cbcc..e9948b0b98 100644
--- a/src/common-principles.md
+++ b/src/common-principles.md
@@ -525,7 +525,7 @@ onset	duration	response_time	trial_type	trial_extra
     are not part of the tabular data file's content.
 
 Tabular files MAY be optionally accompanied by a simple data dictionary
-in the form of a JSON [object](https://www.json.org/json-en.html)
+in the form of a [JSON object](https://www.json.org/json-en.html)
 within a JSON file.
 The JSON files containing the data dictionaries MUST have the same name as
 their corresponding tabular files but with `.json` extensions.
@@ -536,12 +536,6 @@ Note that if a field name included in the data dictionary matches a column name
 then that field MUST contain a description of the corresponding column,
 using an object containing the following fields:
 
-!!! success "Guideline 1"
-
-    For [best tabular phenotypic data](./appendices/phenotype.md):
-    Each tabular phenotypic data TSV file MUST be accompanied by
-    a corresponding data dictionary JSON file.
-
 <!-- This block generates a metadata table.
 The definitions of these fields can be found in
   src/schema/objects/metadata.yaml
diff --git a/src/modality-agnostic-files/data-summary-files.md b/src/modality-agnostic-files/data-summary-files.md
index c2b8fe80b2..a10b5782b9 100644
--- a/src/modality-agnostic-files/data-summary-files.md
+++ b/src/modality-agnostic-files/data-summary-files.md
@@ -53,6 +53,9 @@ to date of birth.
 
 ```JSON
 {
+    "participant_id": {
+        "Description": "participant identifier"
+    },
     "age": {
         "Description": "age of the participant",
         "Units": "year"
@@ -81,6 +84,14 @@ to date of birth.
 }
 ```
 
+It is RECOMMENDED to use the `age` column to record participant age
+at every session in longitudinal or multi-session data sets.
+This reduces data duplication across tabular data files. The `Units` of `age`
+do not have to be years so long as the units of the age
+are written in `participants.json`.
+Consider participant privacy or study objectives when selecting
+the `Units` of `age` or the accuracy of `age` data.
+
 ## Samples file
 
 Template:
@@ -132,43 +143,6 @@ It is RECOMMENDED to accompany each `samples.tsv` file with a sidecar
 }
 ```
 
-## Demographics file
-
-Template:
-
-```Text
-phenotype/
-    demographics.tsv
-    demographics.json
-```
-
-The demographics file is an OPTIONAL tabular phenotypic file in
-the `phenotype/` directory meant to house common subject demographics.
-For example demographics like age, gender, race, and household income.
-A demographics file is RECOMMENDED to use when any participant has
-more than one session of any type.
-It does not replace the participants file, which is meant for unchanging data about
-each participant in the data set. It is instead a superset of the participants file,
-centralizing demographics across as many sessions as are available.
-
-!!! success "Guideline 5"
-
-    For [best tabular phenotypic data](../appendices/phenotype.md):
-    Some studies collect demographics into their own
-    tabular phenotypic data file already. In these cases, it is RECOMMENDED
-    to house this data also in the demographics file.
-
-!!! success "Guideline 6"
-
-    For [best tabular phenotypic data](../appendices/phenotype.md):
-    It is RECOMMENDED to use the `age` column to record participant age
-    at every session in longitudinal or multi-session data sets.
-    This reduces data duplication across tabular data files. The `Units` of `age`
-    do not have to be years so long as the units of the age
-    are written in `phenotype/demographics.json`.
-    Consider participant privacy or study objectives when selecting
-    the `Units` of `age` or the accuracy of `age` data.
-
 ## Scans file
 
 Template:
@@ -236,12 +210,12 @@ meg/sub-control01_task-rest_split-02_meg.nii.gz	1877-06-15T12:15:27
 
 ## Sessions file
 
-Template A (segregated sessions files):
+### Option 1: Segregated sessions files
 
 ```Text
-[sessions.json]
 sub-<label>/
     sub-<label>_sessions.tsv
+    [sub-<label>_sessions.json]
 ```
 
 Optional: Yes
@@ -270,7 +244,7 @@ ses-postdrug	2009-06-16T13:45:30	100
 ses-followup	2009-06-17T13:45:30	110
 ```
 
-Template B (aggregated sessions file):
+### Option 2: Aggregated sessions file
 
 ```Text
 sessions.tsv
@@ -285,7 +259,7 @@ a `participant_id` column followed immediately after by a `session_id` column.
 The intent of this root-level sessions file is to describe the sessions
 in a data set and non-demographic variables changing between sessions.
 Participant's demographic variables should be added to
-a [demographics file](#demographics-file), as described above.
+the [participants file](#participants-file), as described above.
 
 `sessions.tsv` example:
 
@@ -300,32 +274,59 @@ sub-03	ses-postdrug	2009-06-30T14:06:40	115
 sub-03	ses-followup	2009-07-01T14:06:40	120
 ```
 
-!!! success "Guideline 7"
+`sessions.json` example:
+
+```JSON
+{
+    "participant_id": {
+        "Description": "Participant identifier"
+    },
+    "session_id": {
+        "Description": "Session identifier for the session",
+        "Levels": {
+            "ses-predrug": "session before drug administration",
+            "ses-postdrug": "session after drug administration",
+            "ses-followup": "follow-up session"
+        }
+    },
+    "acq_time": {
+        "Description": "Acquisition time of the session"
+    },
+    "systolic_blood_pressure": {
+        "Description": "Systolic blood pressure measured at the beginning of the session in mmHg"
+    }
+}
+```
+
+### Additional validation
+
+When the [`AdditionalValidation` key](dataset-description.md#additional-validation)
+contains `"Phenotype"` in the `dataset_description.json`,
+the following expectations apply to sessions files.
 
-    For [best tabular phenotypic data](../appendices/phenotype.md):
-    If there is more than one session for any one participant, then it is
+1.  If there is more than one session for any one participant, then it is
     REQUIRED to provide a sessions file at the dataset root.
     The sessions file MUST list all sessions for all subjects
-    across imaging and tabular phenotypic data.
-
-    When a sessions file is in use, you MUST NOT provide additional sessions
-    files at the participant-level which would otherwise use
-    the inheritance principle. If a sessions file is provided, then
+    across imaging and tabular phenotypic data. If a sessions file is provided, then
     it MUST begin with a `participant_id` column followed immediately by
     a `session_id` column. The data dictionary JSON file's `session_id` field
     MUST include `Levels` with the description of each `session_id`.
 
-!!! success "Guideline 8"
+1.  When a sessions file is in use, you MUST NOT provide additional sessions
+    files at the participant level, which would otherwise use
+    the inheritance principle.
 
-    For [best tabular phenotypic data](../appendices/phenotype.md):
-    Whenever possible, it is RECOMMENDED to also collect acquisition time
+1.  Whenever possible, it is RECOMMENDED to also collect acquisition time
     for tabular phenotypic data and store the time of acquisition of each row
     inside a column named `acq_time` in the sessions file.
     This is consistent with how acquisition time is recorded for MRI data
     and other time-sensitive measurements (for example systolic blood pressure).
 
-    When it is needed to preserve participant privacy, you SHOULD record
+1.  When it is needed to preserve participant privacy, you SHOULD record
     relative acquisition times with respect to the earliest session.
     Relative session acquisition times MAY be listed as durations from
     the earliest session (baseline) in days, months, or years
     using the `acq_time` column.
+
+To read more about the guidelines for tabular phenotypic data and examples,
+see the [Tabular phenotypic data guidelines appendix](../appendices/phenotype.md).
diff --git a/src/modality-agnostic-files/dataset-description.md b/src/modality-agnostic-files/dataset-description.md
index 79d75264c3..c9e73b11cd 100644
--- a/src/modality-agnostic-files/dataset-description.md
+++ b/src/modality-agnostic-files/dataset-description.md
@@ -42,6 +42,7 @@ and a guide for using macros can be found at
       "DatasetDOI": "OPTIONAL",
       "GeneratedBy": "RECOMMENDED",
       "SourceDatasets": "RECOMMENDED",
+      "AdditionalValidation": "OPTIONAL",
    }
 ) }}
 
@@ -164,6 +165,19 @@ Example:
 }
 ```
 
+### Additional validation
+
+The `AdditionalValidation` key MAY be used to opt into additional validation
+to be performed on the dataset beyond standard BIDS validation.
+The value of this field is either a string or an array of strings,
+each of which MUST be the name of a supported additional validation to be performed.
+
+The currently supported values are:
+
+| **Value**     | **Description**                                                                                                        |
+| ------------- | ---------------------------------------------------------------------------------------------------------------------- |
+| `"Phenotype"` | Stricter validation for tabular phenotypic data, as described in the [phenotype appendix](../appendices/phenotype.md). |
+
 ## `README`
 
 <!-- This block generates a file tree.
diff --git a/src/modality-agnostic-files/phenotypic-and-assessment-data.md b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
index d460b77975..023cecec20 100644
--- a/src/modality-agnostic-files/phenotypic-and-assessment-data.md
+++ b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
@@ -24,57 +24,18 @@ The files can include an arbitrary set of columns, but one of them MUST be
 `participant_id` and the entries of that column MUST correspond to the subjects
 in the BIDS dataset and `participants.tsv` file.
 
-!!! success "Guideline 2"
-
-    For [best tabular phenotypic data](../appendices/phenotype.md):
-    It is REQUIRED to aggregate all participant data into
-    one TSV per tabular phenotypic file.
-
-!!! success "Guideline 3"
-
-    For [best tabular phenotypic data](../appendices/phenotype.md):
-
-    Each measurement tool SHOULD have an independent
-    aggregated data TSV file in which the user collects all subjects, sessions,
-    and/or runs of data as one entry per row (with a row defined by
-    the smallest unit of acquisition). In other words:
-
-    1.  Each row MUST start with `participant_id`.
-    1.  Each TSV file MUST contain a `session_id` column when
-      multiple [sessions](../glossary.md#session-entities) are present
-      in the data set regardless of whether those sessions are in
-      the `phenotype/` data, `sub-<label>/` data, or a combination of the two.
-    1.  If more than one of the same measurement tool is acquired within
-      the same `session_id`, a `run_id` column MUST be added.
-    1.  To encode the acquisition time for a measurement tool’s `session_id`,
-      add the `session_id` to the sessions file and
-      include the OPTIONAL `acq_time` column.
-
-    To see this guideline summarized as a table,
-    see [the appendix](../appendices/phenotype.md#to-summarize-this-guideline-as-a-table).
-
-    Furthermore, if you have to add a `session_id` column to the tabular phenotypic data,
-    you then MUST also introduce a session directory to the imaging data,
-    even if only one imaging session has been created.
-    This rule can be considered as "**if anyone uses sessions, everyone uses sessions**."
-    And vice versa, if imaging data has session directories,
-    all imaging data and tabular phenotypic data MUST have sessions.
-
-    This produces a file in which same-participant entries can take up as many rows as needed
-    according to the smallest unit of acquisition.
-    The combination of values in the `participant_id`, `session_id`, and `run_id` (if present)
-    columns MUST be unique for the entire tabular file.
+<!-- This block generates a columns table.
+The definitions of these fields can be found in
+  src/schema/rules/tabular_data/*.yaml
+and a guide for using macros can be found at
+ https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
+-->
+{{ MACROS___make_columns_table("modality_agnostic.Phenotypes") }}
 
 As with all other tabular data, the additional phenotypic information files
 MAY be accompanied by a JSON file describing the columns in detail
 (see [Tabular files](../common-principles.md#tabular-files)).
 
-!!! success "Guideline 1"
-
-    For [best tabular phenotypic data](../appendices/phenotype.md):
-    Each tabular phenotypic data TSV file MUST be accompanied by
-    a corresponding data dictionary JSON file.
-
 In addition to the column descriptions, the JSON file MAY contain the following fields:
 
 <!-- This block generates a metadata table.
@@ -120,13 +81,6 @@ Please note that in this example `MeasurementToolMetadata` includes information
 about the questionnaire and `adhd_b` and `adhd_c_dx` correspond to individual
 columns.
 
-!!! success "Guideline 4"
-
-    For [best tabular phenotypic data](../appendices/phenotype.md):
-    Whenever possible, it is RECOMMENDED to add `MeasurementToolMetadata` to
-    each `phenotype/<measurement_tool_name>.json` data dictionary.
-    This improves reusability and provides clarity about the measurement tool.
-
 In addition to the keys available to describe columns in all tabular files
 (`LongName`, `Description`, `Levels`, `Units`, and `TermURL`) the
 `participants.json` file as well as phenotypic files can also include column
@@ -134,3 +88,53 @@ descriptions with a `Derivative` field that, when set to true, indicates that
 values in the corresponding column is a transformation of values from other
 columns (for example a summary score based on a subset of items in a
 questionnaire).
+
+## Additional validation
+
+When the [`AdditionalValidation` key](dataset-description.md#additional-validation)
+contains `"Phenotype"` in the `dataset_description.json`,
+the following expectations apply to phenotypic and assessment data.
+
+1.  It is REQUIRED to aggregate all participant data into
+    one TSV per tabular phenotypic file.
+
+1.  Each tabular phenotypic data TSV file MUST be accompanied by
+    a corresponding data dictionary JSON file.
+
+1.  Whenever possible, it is RECOMMENDED to add `MeasurementToolMetadata` to
+    each `phenotype/<measurement_tool_name>.json` data dictionary.
+    This improves reusability and provides clarity about the measurement tool.
+
+1.  Each measurement tool SHOULD have an independent
+    aggregated data TSV file in which the user collects all subjects, sessions,
+    and/or runs of data as one entry per row (with a row defined by
+    the smallest unit of acquisition). In other words:
+
+    1.  Each row MUST start with `participant_id`.
+    1.  Each TSV file MUST contain a `session_id` column when
+        multiple [sessions](../glossary.md#session-entities) are present
+        in the data set regardless of whether those sessions are in
+        the `phenotype/` data, `sub-<label>/` data, or a combination of the two.
+    1.  If more than one of the same measurement tool is acquired within
+        the same `session_id`, a `run_id` column MUST be added.
+    1.  To encode the acquisition time for a measurement tool’s `session_id`,
+        add the `session_id` to the sessions file and
+        include the OPTIONAL `acq_time` column.
+
+    To see this guideline summarized as a table,
+    see [the appendix](../appendices/phenotype.md#to-summarize-this-guideline-as-a-table).
+
+    Furthermore, if you have to add a `session_id` column to the tabular phenotypic data,
+    you then MUST also introduce a session directory to the imaging data,
+    even if only one imaging session has been created.
+    This rule can be considered as "**if anyone uses sessions, everyone uses sessions**."
+    And vice versa, if imaging data has session directories,
+    all imaging data and tabular phenotypic data MUST have sessions.
+
+    This produces a file in which same-participant entries can take up as many rows as needed
+    according to the smallest unit of acquisition.
+    The combination of values in the `participant_id`, `session_id`, and `run_id` (if present)
+    columns MUST be unique for the entire tabular file.
+
+To read more about the guidelines for tabular phenotypic data and examples,
+see the [Tabular phenotypic data guidelines appendix](../appendices/phenotype.md).
diff --git a/src/schema/objects/files.yaml b/src/schema/objects/files.yaml
index ed85f42635..658e53ce87 100644
--- a/src/schema/objects/files.yaml
+++ b/src/schema/objects/files.yaml
@@ -68,19 +68,23 @@ participants:
   display_name: Participant Information
   file_type: regular
   description: |
-    The purpose of this RECOMMENDED file is to describe unchanging properties of participants
+    The purpose of this RECOMMENDED file is to describe properties of participants
     such as sex, species, and strain.
     If this file exists, it MUST contain the column `participant_id`,
     which MUST consist of `sub-<label>` values identifying one row for each participant,
     followed by a list of optional columns describing participants.
-    Each participant MUST be described by one and only one row.
-    For participants with multiple sessions, see
-    the [sessions file](SPEC_ROOT/modality-agnostic-files/data-summary-files.md#sessions-file)
-    and [demographics file](SPEC_ROOT/modality-agnostic-files/data-summary-files.md#demographics-file) sections.
+    For participants with multiple sessions, the `session_id` column MAY be used
+    and, when present, MUST be the second column in the file.
+    Each pairing of `participant_id` and `session_id` MUST be unique.
+    When `session_id` is not in use, each participant MUST be described by one and only one row.
 
     The `participant_id` entries MUST be a superset of all subject directories
     and all `participant_id` entries found among phenotypic and assessment data
     in the `phenotype/` directory.
+    When in use, the `session_id` entries MUST be a superset of all session directories,
+    all tabular phenotypic `session_id` entries found among phenotypic and assessment data
+    in the `phenotype/` directory, and all `session_id` entries found in the
+    [sessions file(s)](SPEC_ROOT/modality-agnostic-files/data-summary-files.md#sessions-file).
 
     Commonly used *optional* columns in `participants.tsv` files are `age`, `sex`,
     `handedness`, `strain`, and `strain_rrid`.
diff --git a/src/schema/objects/metadata.yaml b/src/schema/objects/metadata.yaml
index 5540049f96..1d49af9480 100644
--- a/src/schema/objects/metadata.yaml
+++ b/src/schema/objects/metadata.yaml
@@ -48,6 +48,18 @@ AcquisitionVoxelSize:
     type: number
     exclusiveMinimum: 0
     unit: mm
+AdditionalValidation:
+  name: AdditionalValidation
+  display_name: Additional Validation
+  description: |
+    A string or list of strings naming additional validations to be performed on the data,
+    chosen from a pre-defined set. Currently, the only allowed value is
+    `"Phenotype"`.
+  anyOf:
+    - type: string
+    - type: array
+      items:
+        type: string
 Anaesthesia:
   name: Anaesthesia
   display_name: Anaesthesia
@@ -2253,6 +2265,7 @@ MeasurementToolMetadata:
     Contains two fields: `"Description"` and `"TermURL"`.
     `"Description"` is a free text description of the measurement tool.
     `"TermURL"` is a URL to an entity in an ontology corresponding to this tool.
+    RECOMMENDED when `AdditionalValidation` contains `"Phenotype"` in `dataset_description.json`.
   type: object
   properties:
     TermURL:
diff --git a/src/schema/rules/dataset_metadata.yaml b/src/schema/rules/dataset_metadata.yaml
index b9c8e74761..0d9cc5cf1e 100644
--- a/src/schema/rules/dataset_metadata.yaml
+++ b/src/schema/rules/dataset_metadata.yaml
@@ -19,6 +19,7 @@ dataset_description:
     DatasetDOI: optional
     GeneratedBy: recommended
     SourceDatasets: recommended
+    AdditionalValidation: optional
 
 dataset_authors:
   selectors:
diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index 141420391f..e2c734d69d 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -9,6 +9,7 @@ Participants:
       level: required
       description_addendum: |
         There MUST be exactly one row for each participant.
+    session_id: recommended
     species: recommended
     age: recommended
     sex: recommended
@@ -16,7 +17,7 @@ Participants:
     strain: recommended
     strain_rrid: recommended
     HED: optional
-  index_columns: [participant_id]
+  index_columns: [participant_id, session_id]
   additional_columns: allowed
 
 Phenotypes:

From 6c6ee8bf36bd44e42c46df0177dc50d764afa46e Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Wed, 17 Sep 2025 05:08:59 -0700
Subject: [PATCH 23/53] Attempting to satisfy the CI and remark

---
 src/appendices/phenotype.md                                   | 2 +-
 src/modality-agnostic-files/phenotypic-and-assessment-data.md | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index ab05cdbcbd..911e85d0e5 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -6,7 +6,7 @@ for creating well-organized aggregated tabular phenotypic data.
 ## Guidelines
 
 These guidelines all apply when the
-[`AdditionalValidation` key](dataset-description.md#additional-validation)
+[`AdditionalValidation` key](../modality-agnostic-files/dataset-description.md#additional-validation)
 contains `"Phenotype"` in the `dataset_description.json`.
 They are intended to improve the organization and clarity of
 tabular phenotypic data like the participants file, sessions file,
diff --git a/src/modality-agnostic-files/phenotypic-and-assessment-data.md b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
index 023cecec20..2e33bcf462 100644
--- a/src/modality-agnostic-files/phenotypic-and-assessment-data.md
+++ b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
@@ -111,12 +111,15 @@ the following expectations apply to phenotypic and assessment data.
     the smallest unit of acquisition). In other words:
 
     1.  Each row MUST start with `participant_id`.
+
     1.  Each TSV file MUST contain a `session_id` column when
         multiple [sessions](../glossary.md#session-entities) are present
         in the data set regardless of whether those sessions are in
         the `phenotype/` data, `sub-<label>/` data, or a combination of the two.
+
     1.  If more than one of the same measurement tool is acquired within
         the same `session_id`, a `run_id` column MUST be added.
+
     1.  To encode the acquisition time for a measurement tool’s `session_id`,
         add the `session_id` to the sessions file and
         include the OPTIONAL `acq_time` column.

From 8fa89bc5dc33180e20f415b8159f0060459e2765 Mon Sep 17 00:00:00 2001
From: Chris Markiewicz <effigies@gmail.com>
Date: Thu, 18 Sep 2025 15:50:23 -0400
Subject: [PATCH 24/53] Update
 src/schema/rules/tabular_data/modality_agnostic.yaml

---
 src/schema/rules/tabular_data/modality_agnostic.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index e2c734d69d..e98f01ca0b 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -22,6 +22,7 @@ Participants:
 
 Phenotypes:
   selectors:
+    - datatype == "phenotype"
     - extension == ".tsv"
   initial_columns:
     - participant_id

From ff8666945a746cbc7f2ffa93202f9c0d2df98244 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Thu, 18 Sep 2025 13:02:13 -0700
Subject: [PATCH 25/53] Update columns.yaml

- Add acq_time__phenotype to columns.yaml
- Update modality_agnostic.yaml Phenotypes section to resolve schemacode_ci problems
---
 src/schema/objects/columns.yaml                   |  9 +++++++++
 .../rules/tabular_data/modality_agnostic.yaml     | 15 ++++++---------
 2 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/src/schema/objects/columns.yaml b/src/schema/objects/columns.yaml
index bbfb2882a0..63824fe9f6 100644
--- a/src/schema/objects/columns.yaml
+++ b/src/schema/objects/columns.yaml
@@ -12,6 +12,15 @@ abbreviation:
   description: |
     The unique label abbreviation
   type: string
+acq_time__phenotype:
+  name: acq_time
+  display_name: Phenotypic and assessment data acquisition time
+  description: |
+    Acquisition time refers to when the first data point in each run was acquired.
+    The datetime format and its deidentification are described in
+    [Units](SPEC_ROOT/common-principles.md#units).
+  type: string
+  format: datetime
 acq_time__scans:
   name: acq_time
   display_name: Scan acquisition time
diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index e98f01ca0b..1388b76843 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -29,16 +29,14 @@ Phenotypes:
   columns:
     participant_id:
       level: required
-      description: |
-        MUST be the first column in the file.
+      description_addendum: |
         Note that data for one participant MAY be represented across multiple rows
         in case of multiple sessions or runs, and
         therefore the entry in the `participant_id` column will be repeated.
     session_id:
       level: optional
       level_addendum: required if sessions are defined in the dataset
-      description: |
-        REQUIRED if sessions are defined in the dataset.
+      description_addendum: |
         A `session_id` column MUST be added to all tabular files in the phenotype directory
         as soon as multiple sessions are present in the data set
         regardless of whether those sessions are in the
@@ -46,17 +44,16 @@ Phenotypes:
     run_id:
       level: optional
       level_addendum: required if there are multiple runs within any session
-      description: |
-        REQUIRED if there are multiple runs within any session.
+      description_addendum: |
         A chronological `run` number is used when
         a measurement tool or assessment described by a tabular file
         was repeated within a session.
-    acq_time__scans:
+    acq_time__phenotype:
       level: optional
-      description: |
+      description_addendum: |
         If acquisition time is available, the `acq_time` column CAN be used
         to record the time of acquisition of each row in the tabular file.
-  index_columns: [participant_id, session_id, run_id, acq_time__scans]
+  index_columns: [participant_id, session_id, run_id, acq_time__phenotype]
   additional_columns: allowed
 
 Samples:

From ede68ef86b184bb09a4489768938dd7bb394c970 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Thu, 18 Sep 2025 13:16:21 -0700
Subject: [PATCH 26/53] Update modality_agnostic.yaml

Add initial column orders for Participants and Phenotypes and remove participant_id description_addendum in the Participants section.
---
 src/schema/rules/tabular_data/modality_agnostic.yaml | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
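
Note (placed below the `---`, so it is not part of the commit message): the sketch below illustrates how the `initial_columns` and `index_columns` rules touched by this patch might be checked against a phenotype TSV. It assumes that the listed initial columns, when present, must lead the header in the listed order, and that the index columns present must identify each row uniquely. The helper name `check_phenotype_table` is hypothetical, and this is not the BIDS validator's implementation.

```python
import csv
import io

# Column lists copied from the Phenotypes rule in this patch.
INITIAL_COLUMNS = ["participant_id", "session_id", "run_id"]
INDEX_COLUMNS = ["participant_id", "session_id", "run_id"]


def check_phenotype_table(tsv_text):
    """Return a list of problems found in a phenotype TSV (empty if none)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    header = list(reader.fieldnames or [])
    problems = []

    if "participant_id" not in header:
        problems.append("missing required column participant_id")

    # Assumption: initial columns that are present must come first, in order.
    present = [c for c in INITIAL_COLUMNS if c in header]
    if header[: len(present)] != present:
        problems.append(f"initial columns out of order: {header[:len(present)]}")

    # Assumption: the index columns that are present must be unique per row.
    index = [c for c in INDEX_COLUMNS if c in header]
    seen = set()
    if index:
        for line_no, row in enumerate(reader, start=2):
            key = tuple(row[c] for c in index)
            if key in seen:
                problems.append(f"duplicate index {key} on line {line_no}")
            seen.add(key)
    return problems
```

A file whose header starts with `participant_id`, `session_id` and whose rows have distinct ID combinations yields an empty list; swapping the first two columns or repeating a participant/session pair yields one message per problem.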

diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index 1388b76843..bf84c03ebe 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -4,11 +4,10 @@ Participants:
     - path == "/participants.tsv"
   initial_columns:
     - participant_id
+    - session_id
   columns:
     participant_id:
       level: required
-      description_addendum: |
-        There MUST be exactly one row for each participant.
     session_id: recommended
     species: recommended
     age: recommended
@@ -26,6 +25,8 @@ Phenotypes:
     - extension == ".tsv"
   initial_columns:
     - participant_id
+    - session_id
+    - run_id
   columns:
     participant_id:
       level: required
@@ -53,7 +54,7 @@ Phenotypes:
       description_addendum: |
         If acquisition time is available, the `acq_time` column CAN be used
         to record the time of acquisition of each row in the tabular file.
-  index_columns: [participant_id, session_id, run_id, acq_time__phenotype]
+  index_columns: [participant_id, session_id, run_id]
   additional_columns: allowed
 
 Samples:

From e8ab5dde84eb7ad8f19d5a5628c58abcc1c4d1b1 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Mon, 22 Sep 2025 10:46:31 -0700
Subject: [PATCH 27/53] Update phenotype.md appendix examples, participants
 schema, and other related files

Attempt to address GitHub comments from Arshitha and apply correct filetree macros.
---
 src/appendices/phenotype.md                   | 236 +++++++++++-------
 .../data-summary-files.md                     |   6 +-
 .../phenotypic-and-assessment-data.md         |  13 +-
 src/schema/objects/files.yaml                 |   5 +-
 4 files changed, 161 insertions(+), 99 deletions(-)
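
Note (placed below the `---`, so it is not part of the commit message): the updated appendix example implies that every participant/session pair used in `phenotype/survey.tsv` is also listed in `participants.tsv`. A minimal sketch of that consistency check follows, using rows abbreviated from the example in this patch. The helper names `id_pairs` and `missing_pairs` are hypothetical, and the check itself is an illustration rather than a rule stated by the specification.

```python
import csv
import io


def id_pairs(tsv_text):
    """Collect the (participant_id, session_id) pairs found in a TSV string."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {(row["participant_id"], row["session_id"]) for row in reader}


def missing_pairs(phenotype_tsv, participants_tsv):
    """Pairs used in the phenotype file but not listed in participants.tsv."""
    return sorted(id_pairs(phenotype_tsv) - id_pairs(participants_tsv))


# Rows taken from the appendix example, with some columns dropped for brevity.
participants = (
    "participant_id\tsession_id\tsex\tage\n"
    "sub-01\tses-baseline\tM\t10\n"
    "sub-01\tses-followupMRI\tM\t10\n"
    "sub-01\tses-interview\tM\t11\n"
    "sub-02\tses-baseline\tF\t9\n"
    "sub-02\tses-interview\tF\t10\n"
    "sub-03\tses-baseline\tF\t11\n"
    "sub-03\tses-followupMRI\tF\t12\n"
)
survey = (
    "participant_id\tsession_id\tquestion_1\n"
    "sub-01\tses-baseline\tA\n"
    "sub-01\tses-interview\tA\n"
    "sub-02\tses-baseline\tA\n"
    "sub-02\tses-interview\tB\n"
    "sub-03\tses-baseline\tB\n"
)

print(missing_pairs(survey, participants))
```

Adding a `sub-03`/`ses-interview` row to the survey data would make that pair show up in the result, since the example deliberately leaves that session out of `participants.tsv`.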

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index 911e85d0e5..3245c3d5d5 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -132,16 +132,26 @@ What follows are a few common use case examples for tabular phenotypic files.
 
 File tree
 
-```Text
-phenotype/
-    <measurement_tool_name>.json
-    <measurement_tool_name>.tsv
-sub-01/anat/
-    sub-01_T1w.json
-    sub-01_T1w.nii.gz
-```
-
-Contents of `phenotype/<measurement_tool_name>.tsv`
+<!-- This block generates a file tree.
+A guide for using macros can be found at
+ https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
+-->
+{{ MACROS___make_filetree_example(
+   {
+   "phenotype": {
+      "measurement_tool.json": "",
+      "measurement_tool.tsv": "",
+      },
+   "sub-01": {
+      "anat": {
+         "sub-01_T1w.json": "",
+         "sub-01_T1w.nii.gz": "",
+         }
+      }
+   }
+) }}
+
+Contents of `phenotype/measurement_tool.tsv`
 
 ```tsv
 participant_id	measurement_1	measurement_2
@@ -162,16 +172,28 @@ of prepared data following these guidelines.
 
 File tree
 
-```Text
-phenotype/
-    <measurement_tool_name>.json
-    <measurement_tool_name>.tsv
-sub-01/ses-MRI/anat/
-    sub-01_ses-MRI_T1w.json
-    sub-01_ses-MRI_T1w.nii.gz
-```
-
-Contents of `phenotype/<measurement_tool_name>.tsv`
+<!-- This block generates a file tree.
+A guide for using macros can be found at
+ https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
+-->
+{{ MACROS___make_filetree_example(
+   {
+   "phenotype": {
+      "measurement_tool.json": "",
+      "measurement_tool.tsv": "",
+      },
+   "sub-01": {
+      "ses-MRI": {
+         "anat": {
+            "sub-01_ses-MRI_T1w.json": "",
+            "sub-01_ses-MRI_T1w.nii.gz": "",
+            }
+         }
+      }
+   }
+) }}
+
+Contents of `phenotype/measurement_tool.tsv`
 
 ```tsv
 participant_id	session_id	measurement_1	measurement_2
@@ -182,16 +204,26 @@ sub-01	ses-pheno	value1	value2
 
 File tree
 
-```Text
-phenotype/
-    <measurement_tool_name>.json
-    <measurement_tool_name>.tsv
-sub-01/anat/
-    sub-01_T1w.json
-    sub-01_T1w.nii.gz
-```
-
-Contents of `phenotype/<measurement_tool_name>.tsv`
+<!-- This block generates a file tree.
+A guide for using macros can be found at
+ https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
+-->
+{{ MACROS___make_filetree_example(
+   {
+   "phenotype": {
+      "measurement_tool.json": "",
+      "measurement_tool.tsv": "",
+      },
+   "sub-01": {
+      "anat": {
+         "sub-01_T1w.json": "",
+         "sub-01_T1w.nii.gz": "",
+         }
+      }
+   }
+) }}
+
+Contents of `phenotype/measurement_tool.tsv`
 
 ```tsv
 participant_id	measurement_1	measurement_2
@@ -199,7 +231,7 @@ sub-01	value1	value2
 ```
 
 A session directory **MUST** be present in the participant directory and
-the `session_id` column **MUST** be present in `<measurement_tool_name>.tsv` as well.
+the `session_id` column **MUST** be present in `phenotype/measurement_tool.tsv` as well.
 Sessions must be used consistently for the combination of tabular and
 non-tabular phenotypic data.
 
@@ -207,27 +239,42 @@ non-tabular phenotypic data.
 
 File tree
 
-```Text
-phenotype/
-    <measurement_tool_name>.json
-    <measurement_tool_name>.tsv
-sub-01/
-    ses-MRI1/
-        anat/
-            sub-01_ses-MRI1_T1w.json
-            sub-01_ses-MRI1_T1w.nii.gz
-    ses-MRI2/
-        anat/
-            sub-01_ses-MRI2_T1w.json
-            sub-01_ses-MRI2_T1w.nii.gz
-sub-02/
-    ses-MRI1/
-        anat/
-            sub-02_ses-MRI1_T1w.json
-            sub-02_ses-MRI1_T1w.nii.gz
-```
-
-Contents of `phenotype/<measurement_tool_name>.tsv`
+<!-- This block generates a file tree.
+A guide for using macros can be found at
+ https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
+-->
+{{ MACROS___make_filetree_example(
+   {
+   "phenotype": {
+      "measurement_tool.json": "",
+      "measurement_tool.tsv": "",
+      },
+   "sub-01": {
+      "ses-MRI1": {
+         "anat": {
+            "sub-01_ses-MRI1_T1w.json": "",
+            "sub-01_ses-MRI1_T1w.nii.gz": "",
+            }
+         },
+      "ses-MRI2": {
+         "anat": {
+            "sub-01_ses-MRI2_T1w.json": "",
+            "sub-01_ses-MRI2_T1w.nii.gz": "",
+            }
+         }
+      },
+   "sub-02": {
+      "ses-MRI1": {
+         "anat": {
+            "sub-02_ses-MRI1_T1w.json": "",
+            "sub-02_ses-MRI1_T1w.nii.gz": "",
+            }
+         }
+      }
+   }
+) }}
+
+Contents of `phenotype/measurement_tool.tsv`
 
 ```tsv
 participant_id	session_id	measurement_1	measurement_2
@@ -242,23 +289,46 @@ The `ses-baseline` session collects an MRI and tabular phenotypic data.
 
 File tree
 
-```Text
-participants.json
-participants.tsv
-sessions.json
-sessions.tsv
-phenotype/
-    demographics.json
-    demographics.tsv
-    ...
-sub-01/
-    ses-baseline/
-    ses-followupMRI/
-sub-02/
-    ses-baseline/
-sub-03/
-    ses-baseline/
-    ses-followupMRI/
+<!-- This block generates a file tree.
+A guide for using macros can be found at
+ https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
+-->
+{{ MACROS___make_filetree_example(
+   {
+   "participants.json": "",
+   "participants.tsv": "",
+   "sessions.json": "",
+   "sessions.tsv": "",
+   "phenotype": {
+      "survey.json": "",
+      "survey.tsv": "",
+      },
+   "sub-01": {
+      "ses-baseline/": "",
+      "ses-followupMRI/": "",
+      },
+   "sub-02": {
+      "ses-baseline/": "",
+      },
+   "sub-03": {
+      "ses-baseline/": "",
+      "ses-followupMRI/": "",
+      }
+   }
+) }}
+
+Contents of `participants.tsv`. Participant properties that can change
+from session to session belong here especially.
+
+```tsv
+participant_id	session_id	sex	age	gender	race	household_income
+sub-01	ses-baseline	M	10	3	4	5
+sub-01	ses-followupMRI	M	10	3	4	5
+sub-01	ses-interview	M	11	4	4	6
+sub-02	ses-baseline	F	9	1	3	3
+sub-02	ses-interview	F	10	1	7	3
+sub-03	ses-baseline	F	11	2	10	4
+sub-03	ses-followupMRI	F	12	5	10	4
 ```
 
 Contents of `sessions.tsv`.
@@ -295,27 +365,17 @@ Contents of `sessions.json`. Note how the `session_id` `Levels` are clearly desc
 }
 ```
 
-Contents of `participants.tsv`.
-
-```tsv
-participant_id	sex
-sub-01	M
-sub-02	F
-sub-03	F
-```
-
-Contents of `phenotype/demographics.tsv`. Measures or features that can change
-from session to session belong here especially.
+Contents of `phenotype/survey.tsv`. Note how `sub-03` does not have
+a row for `ses-interview` because that session was not collected
+and is absent above in the `participants.tsv` and `sessions.tsv` files.
 
 ```tsv
-participant_id	session_id	age	gender	race	household_income
-sub-01	ses-baseline	10	3	4	5
-sub-01	ses-followupMRI	10	3	4	5
-sub-01	ses-interview	11	4	4	6
-sub-02	ses-baseline	9	1	3	3
-sub-02	ses-interview	10	1	7	3
-sub-03	ses-baseline	11	2	10	4
-sub-03	ses-followupMRI	12	5	10	4
+participant_id	session_id	question_1	question_2	question_3
+sub-01	ses-baseline	A	2	no
+sub-01	ses-interview	A	3	yes
+sub-02	ses-baseline	A	2	no
+sub-02	ses-interview	B	1	unsure
+sub-03	ses-baseline	B	3	no
 ```
 
 For more complete examples, see the `pheno00*`
diff --git a/src/modality-agnostic-files/data-summary-files.md b/src/modality-agnostic-files/data-summary-files.md
index a10b5782b9..a95fe890a1 100644
--- a/src/modality-agnostic-files/data-summary-files.md
+++ b/src/modality-agnostic-files/data-summary-files.md
@@ -312,9 +312,9 @@ the following expectations apply to sessions files.
     a `session_id` column. The data dictionary JSON file's `session_id` field
     MUST include `Levels` with the description of each `session_id`.
 
-1.  When a sessions file is in use, you MUST NOT provide additional sessions
-    files at the participant-level which would otherwise use
-    the inheritance principle.
+1.  When a root-level sessions file is in use, you MUST NOT provide
+    additional sessions files at the participant-level
+    which would otherwise use the inheritance principle.
 
 1.  Whenever possible, it is RECOMMENDED to also collect acquisition time
     for tabular phenotypic data and store the time of acquisition of each row
diff --git a/src/modality-agnostic-files/phenotypic-and-assessment-data.md b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
index 2e33bcf462..59b623399f 100644
--- a/src/modality-agnostic-files/phenotypic-and-assessment-data.md
+++ b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
@@ -32,9 +32,13 @@ and a guide for using macros can be found at
 -->
 {{ MACROS___make_columns_table("modality_agnostic.Phenotypes") }}
 
-As with all other tabular data, the additional phenotypic information files
-MAY be accompanied by a JSON file describing the columns in detail
+As with all other tabular data, the additional tabular phenotypic data
+MAY be accompanied by a JSON data dictionary file describing the columns in detail
 (see [Tabular files](../common-principles.md#tabular-files)).
+When the [`AdditionalValidation` key](dataset-description.md#additional-validation)
+contains `"Phenotype"` in the `dataset_description.json`,
+then the additional tabular phenotypic data
+MUST be accompanied by a JSON data dictionary file.
 
 In addition to the column descriptions, the JSON file MAY contain the following fields:
 
@@ -116,11 +120,12 @@ the following expectations apply to phenotypic and assessment data.
         multiple [sessions](../glossary.md#session-entities) are present
         in the data set regardless of whether those sessions are in
         the `phenotype/` data, `sub-<label>/` data, or a combination of the two.
+        See the first two examples in [the appendix](../appendices/phenotype.md).
 
     1.  If more than one of the same measurement tool is acquired within
         the same `session_id`, a `run_id` column MUST be added.
 
-    1.  To encode the acquisition time for a measurement tool’s `session_id`,
+    1.  To encode the acquisition time for a tabular phenotypic file’s `session_id`,
         add the `session_id` to the sessions file and
         include the OPTIONAL `acq_time` column.
 
@@ -130,7 +135,7 @@ the following expectations apply to phenotypic and assessment data.
     Furthermore, if you have to add a `session_id` column to the tabular phenotypic data,
     you then MUST also introduce a session directory to the imaging data,
     even if only one imaging session has been created.
-    This rule can be considered as "**if anyone uses sessions, everyone uses sessions**."
+    This rule can be considered as "**if anyone uses sessions, everyone uses sessions.**"
     And vice versa, if imaging data has session directories,
     all imaging data and tabular phenotypic data MUST have sessions.
 
diff --git a/src/schema/objects/files.yaml b/src/schema/objects/files.yaml
index 658e53ce87..c3c36a0393 100644
--- a/src/schema/objects/files.yaml
+++ b/src/schema/objects/files.yaml
@@ -73,10 +73,7 @@ participants:
     If this file exists, it MUST contain the column `participant_id`,
     which MUST consist of `sub-<label>` values identifying one row for each participant,
     followed by a list of optional columns describing participants.
-    For participants with multiple sessions, the `session_id` column MAY be used
-    and, when present, MUST be the second column in the file.
-    Each pairing of `participant_id` and `session_id` MUST be unique.
-    When `session_id` is not in use, each participant MUST be described by one and only one row.
+    For participants with multiple sessions, the `session_id` column is RECOMMENDED.
 
     The `participant_id` entries MUST be a superset of all subject directories
     and all `participant_id` entries found among phenotypic and assessment data

From 3490e9da9b09724f7e679203bc105e0c513908f1 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Tue, 23 Sep 2025 06:37:49 -0700
Subject: [PATCH 28/53] Update modality_agnostic.yaml

Trying to make the dev validator happier with the participant_id column optionally first in sessions files.
---
 src/schema/rules/tabular_data/modality_agnostic.yaml | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index bf84c03ebe..1fa4eb8d4a 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -90,8 +90,10 @@ Sessions:
     - suffix == "sessions"
     - extension == ".tsv"
   initial_columns:
+    - participant_id
     - session_id
   columns:
+    participant_id: optional
     session_id:
       level: required
       description_addendum: |
@@ -99,7 +101,7 @@ Sessions:
     acq_time__sessions: optional
     pathology: recommended
     HED: optional
-  index_columns: [session_id]
+  index_columns: [participant_id, session_id]
   additional_columns: allowed
 
 Phenotype:

From d02e0bfc6bedb19a6f376338c6f3c1429a246b48 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Tue, 23 Sep 2025 06:44:01 -0700
Subject: [PATCH 29/53] Update modality_agnostic.yaml

Missed a column for the Sessions file: run_id.
---
 src/schema/rules/tabular_data/modality_agnostic.yaml | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index 1fa4eb8d4a..1d079b0eb7 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -92,16 +92,18 @@ Sessions:
   initial_columns:
     - participant_id
     - session_id
+    - run_id
   columns:
     participant_id: optional
     session_id:
       level: required
       description_addendum: |
         There MUST be exactly one row for each session.
+    run_id: optional
     acq_time__sessions: optional
     pathology: recommended
     HED: optional
-  index_columns: [participant_id, session_id]
+  index_columns: [participant_id, session_id, run_id]
   additional_columns: allowed
 
 Phenotype:

From f8d633371f78e1ccdc682712170bb140beb7e2f7 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Tue, 23 Sep 2025 08:29:06 -0700
Subject: [PATCH 30/53] Update modality_agnostic.yaml

Missed the session_id column being 2nd for Phenotype.
---
 src/schema/rules/tabular_data/modality_agnostic.yaml | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index 1d079b0eb7..7be753c8a7 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -111,8 +111,10 @@ Phenotype:
     - datatype == 'phenotype'
   initial_columns:
     - participant_id
+    - session_id
   columns:
     participant_id: required
+    session_id: optional
     HED: optional
-  index_columns: [participant_id]
+  index_columns: [participant_id, session_id]
   additional_columns: allowed

From bd083c0625a7e43750b6b7b99baa5a320b60fc53 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Wed, 24 Sep 2025 10:51:52 -0700
Subject: [PATCH 31/53] Update phenotype appendix

- Added in a new guideline 7 to encourage the use of participants and sessions files for different uses.
- Re-numbered old guidelines 7-9 to 8-10.
---
 src/appendices/phenotype.md | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index 3245c3d5d5..b5454f3617 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -85,19 +85,32 @@ the `Units` of `age` or the accuracy of `age` data.
 ### 6. Use the sessions file at the root-level
 
 If there is more than one session for any one participant, then
-it is REQUIRED to provide a sessions file at the dataset root.
+it is RECOMMENDED to provide a sessions file at the dataset root.
 The sessions file MUST list all sessions for all subjects across
 imaging and tabular phenotypic data.
 If a sessions file is provided, then it MUST begin with a `participant_id` column
 followed immediately by a `session_id` column. The data dictionary JSON file’s
 `session_id` field MUST include `Levels` with the description of each `session_id`.
 
-### 7. Use either root-level sessions file or participant-level sessions files
+
+### 7. Record participant properties in the participants file and session properties in the sessions file
+
+Since the same `participant_id` and `session_id` columns can be used
+in both the participants file and the sessions file,
+use the two files to distinguish
+properties of participants from properties of sessions.
+Properties of participants MAY include things like
+age, sex, race, or household income.
+Properties of sessions MAY include things like
+acquisition time, measurement device properties,
+and indoor or outdoor experimental conditions.
+
+### 8. Use either root-level sessions file or participant-level sessions files
 
 When a sessions file is in use, you MUST NOT provide additional sessions files
 at the participant-level which would otherwise use the inheritance principle.
 
-### 8. Record acquisition time of sessions with `acq_time`
+### 9. Record acquisition time of sessions with `acq_time`
 
 Whenever possible, it is RECOMMENDED to also collect acquisition time for
 tabular phenotypic data and store the time of acquisition[^2] of each row
@@ -105,7 +118,7 @@ inside a column named `acq_time` in the sessions file.
 This is consistent with how acquisition time is recorded for MRI data
 and other time-sensitive measurements (for example systolic blood pressure).
 
-### 9. Respect participant privacy when recording acquisition times
+### 10. Respect participant privacy when recording acquisition times
 
 When needed to preserve participant privacy, you SHOULD record
 relative acquisition times with respect to the earliest session.
@@ -122,7 +135,7 @@ A short summary table here describes when to use which files.
 | :----------------------------- | :------------------ | :-------------------- |
 | Participants                   | RECOMMENDED         | RECOMMENDED           |
 | Phenotypic and assessment data | RECOMMENDED         | RECOMMENDED           |
-| Sessions                       | OPTIONAL            | REQUIRED              |
+| Sessions                       | OPTIONAL            | RECOMMENDED           |
 
 ## Examples
 

From ec2703b137812bc9041bc2462aba74e3f7124d25 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Wed, 24 Sep 2025 14:23:09 -0700
Subject: [PATCH 32/53] Update phenotype appendix

Removing excess line I forgot to remove earlier. Thanks remark CI!
---
 src/appendices/phenotype.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index b5454f3617..5946a26644 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -92,7 +92,6 @@ If a sessions file is provided, then it MUST begin with a `participant_id` colum
 followed immediately by a `session_id` column. The data dictionary JSON file’s
 `session_id` field MUST include `Levels` with the description of each `session_id`.
 
-
 ### 7. Record participant properties in the participants file and session properties in the sessions file
 
 Since the same `participant_id` and `session_id` columns can be used

From 00d8f25998a70fb0d6cc5aaf5576060527e1f1dd Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Thu, 25 Sep 2025 08:55:19 -0700
Subject: [PATCH 33/53] Delete .vscode/settings.json

Accidental file.
---
 .vscode/settings.json | 5 -----
 1 file changed, 5 deletions(-)
 delete mode 100644 .vscode/settings.json

diff --git a/.vscode/settings.json b/.vscode/settings.json
deleted file mode 100644
index 3a704128f1..0000000000
--- a/.vscode/settings.json
+++ /dev/null
@@ -1,5 +0,0 @@
-{
-    "githubPullRequests.ignoredPullRequestBranches": [
-        "master"
-    ]
-}

From 41f0f70161a64c5d2958f16e59df206db92e14a3 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Thu, 25 Sep 2025 09:22:03 -0700
Subject: [PATCH 34/53] Apply suggestions from code review

Added in easily-agreeable suggestions in a batch.

Co-authored-by: Sebastian Urchs <surchs@users.noreply.github.com>
---
 src/appendices/phenotype.md                   | 25 +++++++++++--------
 .../phenotypic-and-assessment-data.md         |  2 +-
 src/schema/objects/files.yaml                 |  2 +-
 3 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index 5946a26644..b028e34c13 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -1,21 +1,26 @@
 # Tabular phenotypic data guidelines
 
 This appendix is a collection of guidelines and examples
-for creating well-organized aggregated tabular phenotypic data.
+for curating well-organized tabular phenotypic data.
 
 ## Guidelines
 
-These guidelines all apply when the
-[`AdditionalValidation` key](../modality-agnostic-files/dataset-description.md#additional-validation)
-contains `"Phenotype"` in the `dataset_description.json`.
-They are intended to improve the organization and clarity of
+These guidelines are intended to improve the organization and clarity of
 tabular phenotypic data like the participants file, sessions file,
 and phenotypic and assessment data.
 
+They are recommendations and are by default ignored during validation.
+You can make them mandatory during validation by setting the
+[`AdditionalValidation` key](../modality-agnostic-files/dataset-description.md#additional-validation)
+to `"Phenotype"` in the `dataset_description.json`.
+
+
 ### 1. Aggregate data across sessions
 
-Aggregation refers to the contents of the TSV file. It is REQUIRED
-to collect all participant data into one TSV per tabular phenotypic file.
+Aggregate participant information across all sessions into one tabular TSV file per 
+measurement or phenotypic assessment and store this file in the `/phenotype` directory.
+Demographic information is a special case and  MUST be aggregated 
+in the `participants.tsv` file at the root level of the dataset.
 
 ### 2. Always pair tabular data with data dictionaries
 
@@ -46,9 +51,9 @@ the smallest unit of acquisition). In other words:
 1.  If more than one of the same measurement tool is acquired within
     the same `session_id`, a `run_id` column MUST be added.
 
-1.  To encode the acquisition time for a measurement tool’s `session_id`,
-    add the `session_id` to the sessions file and
-    include the OPTIONAL `acq_time` column.
+1.  Encoding  the acquisition time for a measurement tool’s `session_id`,
+     is RECOMMENDED. This information MUST be stored in the `sessions.tsv`
+     file at the root level of the dataset in the `acq_time` column.
 
 #### To summarize this guideline as a table
 
diff --git a/src/modality-agnostic-files/phenotypic-and-assessment-data.md b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
index 59b623399f..fb93c5561a 100644
--- a/src/modality-agnostic-files/phenotypic-and-assessment-data.md
+++ b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
@@ -18,7 +18,7 @@ Each of the measurement files MUST be kept in a `/phenotype` directory placed
 at the root of the BIDS dataset and MUST end with the `.tsv` extension.
 Filenames SHOULD be chosen to reflect the contents of the file.
 For example, the "Adult ADHD Clinical Diagnostic Scale" could be saved in a file
-called `phenotype/acds_adult.tsv`.
+called `/phenotype/acds_adult.tsv`.
 
 The files can include an arbitrary set of columns, but one of them MUST be
 `participant_id` and the entries of that column MUST correspond to the subjects
diff --git a/src/schema/objects/files.yaml b/src/schema/objects/files.yaml
index c3c36a0393..9fa03d972d 100644
--- a/src/schema/objects/files.yaml
+++ b/src/schema/objects/files.yaml
@@ -69,7 +69,7 @@ participants:
   file_type: regular
   description: |
     The purpose of this RECOMMENDED file is to describe properties of participants
-    such as sex, species, and strain.
+    such as age, sex, handedness, species, and strain.
     If this file exists, it MUST contain the column `participant_id`,
     which MUST consist of `sub-<label>` values identifying one row for each participant,
     followed by a list of optional columns describing participants.

From cdfc0d23d94d8093820e342abf39667d14ecac04 Mon Sep 17 00:00:00 2001
From: "pre-commit-ci[bot]"
 <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Date: Thu, 25 Sep 2025 16:22:25 +0000
Subject: [PATCH 35/53] [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci
---
 src/appendices/phenotype.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index b028e34c13..9829e76db2 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -17,9 +17,9 @@ to `"Phenotype"` in the `dataset_description.json`.
 
 ### 1. Aggregate data across sessions
 
-Aggregate participant information across all sessions into one tabular TSV file per 
+Aggregate participant information across all sessions into one tabular TSV file per
 measurement or phenotypic assessment and store this file in the `/phenotype` directory.
-Demographic information is a special case and  MUST be aggregated 
+Demographic information is a special case and MUST be aggregated
 in the `participants.tsv` file at the root level of the dataset.
 
 ### 2. Always pair tabular data with data dictionaries

From 6cbb4eea2bb200bd93c349c101b93c0e1975ea42 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Mon, 29 Sep 2025 10:44:35 -0700
Subject: [PATCH 36/53] Update BEP036 files more

Attempt to address more of @surchs comments.
---
 src/appendices/phenotype.md                   |  83 +++++++-----
 .../data-summary-files.md                     | 121 ++++++++++--------
 .../rules/tabular_data/modality_agnostic.yaml |  10 +-
 3 files changed, 119 insertions(+), 95 deletions(-)

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index 9829e76db2..7810e791c0 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -27,12 +27,14 @@ in the `participants.tsv` file at the root level of the dataset.
 Tabular phenotypic data MUST be prepared as one pair of a tabular file
 in tab-separated value (TSV) format and a corresponding data dictionary
 in JavaScript Object Notation (JSON) format.
+See the [Tabular files section](../common-principles.md#tabular-files) for more information.
 
 ### 3. Add `MeasurementToolMetadata` to each tabular phenotypic measurement tool
 
 Whenever possible, it is RECOMMENDED to add `MeasurementToolMetadata` to
 each `phenotype/<measurement_tool_name>.json` data dictionary.
 This improves reusability and provides clarity about the measurement tool.
+See [`MeasurementToolMetadata` in the glossary](../glossary.md#measurementtoolmetadata-metadata) for more.
 
 ### 4. Ensure minimal annotation for phenotypic and assessment data
 
@@ -41,22 +43,20 @@ aggregated data TSV file in which the user collects all subjects, sessions,
 and/or runs of data as one entry per row (with a row defined by
 the smallest unit of acquisition). In other words:
 
-1.  Each row MUST start with `participant_id`.
+-   Each row MUST start with `participant_id`.
 
-1.  Each TSV file MUST contain a `session_id` column when
-    multiple [sessions](../glossary.md#session-entities)[^1] are present
+-   Each TSV file MUST contain a `session_id` column when
+    multiple [sessions](../glossary.md#session-entities)[<sup>1</sup>](#footnotes) are present
     in the data set regardless of whether those sessions are in
     the `phenotype/` data, `sub-<label>/` data, or a combination of the two.
 
-1.  If more than one of the same measurement tool is acquired within
+-   If more than one of the same measurement tool is acquired within
     the same `session_id`, a `run_id` column MUST be added.
 
-1.  Encoding  the acquisition time for a measurement tool’s `session_id`,
+-   Encoding the acquisition time for a measurement tool’s `session_id`
      is RECOMMENDED. This information MUST be stored in the `sessions.tsv`
      file at the root level of the dataset in the `acq_time` column.
 
-#### To summarize this guideline as a table
-
 <!-- This block generates a columns table.
 The definitions of these fields can be found in
   src/schema/rules/tabular_data/*.yaml
@@ -65,17 +65,7 @@ and a guide for using macros can be found at
 -->
 {{ MACROS___make_columns_table("modality_agnostic.Phenotypes") }}
 
-Furthermore, if you have to add a `session_id` column to the
-tabular phenotypic data, you then MUST also introduce a session directory to the
-imaging data, even if only one imaging session has been created.
 This rule can be considered as "**if anyone uses sessions, everyone uses sessions**."
-And vice versa, if imaging data has session directories,
-all imaging data and tabular phenotypic data MUST have sessions.
-
-This produces a file in which same-participant entries can take up as many rows
-as needed according to the smallest unit of acquisition.
-The combination of values in the `participant_id`, `session_id`, and `run_id` (if present)
-columns MUST be unique for the entire tabular file.
 
 ### 5. Store longitudinal age in the participants file
 
@@ -109,16 +99,15 @@ Properties of sessions MAY include things like
 acquisition time, measurement device properties,
 and indoor or outdoor experimental conditions.
 
-### 8. Use either root-level sessions file or participant-level sessions files
+### 8. Use either root-level sessions file or participant-level sessions files, but not both
 
 When a sessions file is in use, you MUST NOT provide additional sessions files
 at the participant-level which would otherwise use the inheritance principle.
 
-### 9. Record acquisition time of sessions with `acq_time`
+### 9. Record acquisition time of all sessions with `acq_time`
 
-Whenever possible, it is RECOMMENDED to also collect acquisition time for
-tabular phenotypic data and store the time of acquisition[^2] of each row
-inside a column named `acq_time` in the sessions file.
+It is RECOMMENDED to store acquisition time[<sup>2</sup>](#footnotes)
+for tabular phenotypic data in the sessions file in a column named `acq_time`.
 This is consistent with how acquisition time is recorded for MRI data
 and other time-sensitive measurements (for example systolic blood pressure).
 
@@ -133,13 +122,10 @@ using the `acq_time` column.
 ## Summary
 
 This appendix described guidelines for best tabular phenotypic data.
-A short summary table here describes when to use which files.
-
-| File                           | Single session data | Multiple session data |
-| :----------------------------- | :------------------ | :-------------------- |
-| Participants                   | RECOMMENDED         | RECOMMENDED           |
-| Phenotypic and assessment data | RECOMMENDED         | RECOMMENDED           |
-| Sessions                       | OPTIONAL            | RECOMMENDED           |
+In summary, it is RECOMMENDED to always use the participants file
+and separate files by assessment in the `/phenotype/` directory,
+since they each collect different information.
+If you use sessions, then the sessions file is also RECOMMENDED.
 
 ## Examples
 
@@ -195,6 +181,8 @@ A guide for using macros can be found at
 -->
 {{ MACROS___make_filetree_example(
    {
+   "sessions.json": "",
+   "sessions.tsv": "",
    "phenotype": {
       "measurement_tool.json": "",
       "measurement_tool.tsv": "",
@@ -210,6 +198,14 @@ A guide for using macros can be found at
    }
 ) }}
 
+Contents of `sessions.tsv`
+
+```tsv
+participant_id	session_id	acq_time
+sub-01	ses-pheno	2001-01-01T12:05:00
+sub-01	ses-MRI	2001-03-01T13:14:00
+```
+
 Contents of `phenotype/measurement_tool.tsv`
 
 ```tsv
@@ -254,6 +250,11 @@ non-tabular phenotypic data.
 
 ### 2 participants with a mix of tabular phenotypic data and imaging sessions
 
+In this example, both participants have
+a phenotypic measurement tool and an MRI acquired during `ses-MRI1`.
+`sub-01` has a `ses-MRI2` with no phenotypic measurement tool acquired
+and `sub-02` has a `ses-pheno` where no MRI was acquired.
+
 File tree
 
 <!-- This block generates a file tree.
@@ -262,6 +263,8 @@ A guide for using macros can be found at
 -->
 {{ MACROS___make_filetree_example(
    {
+   "sessions.json": "",
+   "sessions.tsv": "",
    "phenotype": {
       "measurement_tool.json": "",
       "measurement_tool.tsv": "",
@@ -291,13 +294,23 @@ A guide for using macros can be found at
    }
 ) }}
 
+Contents of `sessions.tsv`
+
+```tsv
+participant_id	session_id	acq_time
+sub-01	ses-MRI1	2001-01-01T11:12:00
+sub-01	ses-MRI2	2001-07-01T13:14:00
+sub-02	ses-MRI1	2001-01-18T15:16:00
+sub-02	ses-pheno	2001-02-20T12:05:00
+```
+
 Contents of `phenotype/measurement_tool.tsv`
 
 ```tsv
 participant_id	session_id	measurement_1	measurement_2
-sub-01	ses-pheno1	value1	value2
-sub-02	ses-pheno1	value3	value4
-sub-02	ses-pheno2	value5	value6
+sub-01	ses-MRI1	value1	value2
+sub-02	ses-MRI1	value3	value4
+sub-02	ses-pheno	value5	value6
 ```
 
 ### 3 participants with 3 different kinds of sessions among them
@@ -398,12 +411,14 @@ sub-03	ses-baseline	B	3	no
 For more complete examples, see the `pheno00*`
 [bids-examples on GitHub](https://github.com/bids-standard/bids-examples/).
 
-[^1]: A session is any logical grouping of imaging and behavioral data consistent
+## Footnotes
+
+<sup>1</sup> A session is any logical grouping of imaging and behavioral data consistent
 across participants. Session can (but doesn't have to) be synonymous to a visit
 in a longitudinal study. In situations where different data types are obtained over
 several visits (for example fMRI on one day followed by DWI the day after)
 those can still be grouped in one session. Refer to the
 [definition of session](../glossary.md#session-entities) for more details.
 
-[^2]: Datetime format and the anonymization procedure are
+<sup>2</sup> Datetime format and the anonymization procedure are
 described in [Units](../common-principles.md#units).
diff --git a/src/modality-agnostic-files/data-summary-files.md b/src/modality-agnostic-files/data-summary-files.md
index a95fe890a1..0892aac038 100644
--- a/src/modality-agnostic-files/data-summary-files.md
+++ b/src/modality-agnostic-files/data-summary-files.md
@@ -210,22 +210,14 @@ meg/sub-control01_task-rest_split-02_meg.nii.gz	1877-06-15T12:15:27
 
 ## Sessions file
 
-### Option 1: Segregated sessions files
-
-```Text
-sub-<label>/
-    sub-<label>_sessions.tsv
-    [sub-<label>_sessions.json]
-```
-
-Optional: Yes
-
-In case of multiple sessions there is an option of adding additional
-`sessions.tsv` files describing each session and variables changing between sessions.
-In such case one file per participant SHOULD be added.
-These files MUST include a `session_id` column and describe each session by one and only one row.
-Column names in `sessions.tsv` files MUST be different from group level participant key column names in the
-[`participants.tsv` file](#participants-file).
+In case of multiple sessions there is an option of adding an additional
+`sessions.tsv` file describing each session and variables changing between sessions.
+It is RECOMMENDED to provide this as a single file at the root-level of the dataset.
+It is OPTIONAL to provide these as separate files at the subject-level of the dataset.
+The intent of the sessions file is to describe the sessions
+in a data set and non-demographic variables changing between sessions.
+Column names in `sessions.tsv` files MUST be different from participant key column names in
+the [participants file](#participants-file).
 
 <!-- This block generates a columns table.
 The definitions of these fields can be found in
@@ -235,31 +227,48 @@ and a guide for using macros can be found at
 -->
 {{ MACROS___make_columns_table("modality_agnostic.Sessions") }}
 
-`sub-<label>/sub-<label>_sessions.tsv` example:
+`sessions.json` example:
 
-```tsv
-session_id	acq_time	systolic_blood_pressure
-ses-predrug	2009-06-15T13:45:30	120
-ses-postdrug	2009-06-16T13:45:30	100
-ses-followup	2009-06-17T13:45:30	110
+```JSON
+{
+    "participant_id": {
+        "Description": "Participant identifier"
+    },
+    "session_id": {
+        "Description": "Session identifier for the session",
+        "Levels": {
+            "ses-predrug": "session before drug administration",
+            "ses-postdrug": "session after drug administration",
+            "ses-followup": "follow-up session"
+        }
+    },
+    "acq_time": {
+        "Description": "Acquisition time of the session"
+    },
+    "systolic_blood_pressure": {
+        "Description": "Systolic blood pressure measured at the beginning of the session in mmHg"
+    }
+}
 ```
 
-### Option 2: Aggregated sessions file
+### RECOMMENDED: Root-level sessions file
 
-```Text
-sessions.tsv
-sessions.json
-```
+<!-- This block generates a file tree.
+A guide for using macros can be found at
+ https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
+-->
+{{ MACROS___make_filetree_example(
+   {
+   "sessions.tsv": "",
+   "[sessions.json]": "",
+   }
+) }}
 
 Optional: Yes
 
-An aggregated sessions file CAN be provided at the dataset root.
+An aggregated sessions file is RECOMMENDED to be provided at the dataset root.
 If a root-level sessions file is provided, then it MUST begin with
 a `participant_id` column followed immediately after by a `session_id` column.
-The intent of this root-level sessions file is to describe the sessions
-in a data set and non-demographic variables changing between sessions.
-Participant's demographic variables should be added to
-the [participants file](#participants-file), as described above.
 
 `sessions.tsv` example:
 
@@ -274,28 +283,32 @@ sub-03	ses-postdrug	2009-06-30T14:06:40	115
 sub-03	ses-followup	2009-07-01T14:06:40	120
 ```
 
-`sessions.json` example:
+### OPTIONAL: Subject-level sessions files
 
-```JSON
-{
-    "participant_id": {
-        "Description": "Participant identifier"
-    },
-    "session_id": {
-        "Description": "Session identifier for the session",
-        "Levels": {
-            "ses-predrug": "session before drug administration",
-            "ses-postdrug": "session after drug administration",
-            "ses-followup": "follow-up session"
-        }
-    },
-    "acq_time": {
-        "Description": "Acquisition time of the session"
-    },
-    "systolic_blood_pressure": {
-        "Description": "Systolic blood pressure measured at the beginning of the session in mmHg"
-    }
-}
+<!-- This block generates a file tree.
+A guide for using macros can be found at
+ https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
+-->
+{{ MACROS___make_filetree_example(
+   {
+   "sub-<label>": {
+      "sub-<label>_sessions.tsv": "",
+      "[sub-<label>_sessions.json]": "",
+      }
+   }
+) }}
+
+Optional: Yes
+
+When one sessions file per participant is used,
+these files MUST include a `session_id` column and describe each session by one and only one row.
+`sub-<label>/sub-<label>_sessions.tsv` example:
+
+```tsv
+session_id	acq_time	systolic_blood_pressure
+ses-predrug	2009-06-15T13:45:30	120
+ses-postdrug	2009-06-16T13:45:30	100
+ses-followup	2009-06-17T13:45:30	110
 ```
 
 ### Additional validation
@@ -314,7 +327,7 @@ the following expectations apply to sessions files.
 
 1.  When a root-level sessions file is in use, you MUST NOT provide
     additional sessions files at the participant-level
-    which would otherwise use the inheritance principle.
+    which would otherwise obey the inheritance principle.
 
 1.  Whenever possible, it is RECOMMENDED to also collect acquisition time
     for tabular phenotypic data and store the time of acquisition of each row
diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index 7be753c8a7..ce8cb8f852 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -6,8 +6,7 @@ Participants:
     - participant_id
     - session_id
   columns:
-    participant_id:
-      level: required
+    participant_id: required
     session_id: recommended
     species: recommended
     age: recommended
@@ -95,12 +94,9 @@ Sessions:
     - run_id
   columns:
     participant_id: optional
-    session_id:
-      level: required
-      description_addendum: |
-        There MUST be exactly one row for each session.
+    session_id: required
     run_id: optional
-    acq_time__sessions: optional
+    acq_time__sessions: recommended
     pathology: recommended
     HED: optional
   index_columns: [participant_id, session_id, run_id]

From fe3ddab821b38744504cea8a5146915eedeac239 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Mon, 29 Sep 2025 10:51:04 -0700
Subject: [PATCH 37/53] Update appendices/phenotype.md

Thanks for catching that excess newline, remark!
---
 src/appendices/phenotype.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index 7810e791c0..db65419282 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -14,7 +14,6 @@ You can make them mandatory during validation by setting the
 [`AdditionalValidation` key](../modality-agnostic-files/dataset-description.md#additional-validation)
 to `"Phenotype"` in the `dataset_description.json`.
 
-
 ### 1. Aggregate data across sessions
 
 Aggregate participant information across all sessions into one tabular TSV file per

From 40f675189eb35245e8a68b714053b6f1f69a680a Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Tue, 30 Sep 2025 06:10:44 -0700
Subject: [PATCH 38/53] Update
 src/schema/rules/tabular_data/modality_agnostic.yaml

Remove acq_time as a phenotype column recommendation/option, as it should go into the sessions file instead.
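
With `acq_time` kept only in the sessions file, a consumer that needs the
acquisition time next to a phenotype row can look it up by
(`participant_id`, `session_id`). The following is a minimal sketch in
Python under that assumption; `read_tsv` and `attach_acq_time` are
hypothetical helper names, and the phenotype row is made-up sample data.

```python
import csv
import io

SESSIONS_TSV = (
    "participant_id\tsession_id\tacq_time\n"
    "sub-01\tses-pheno\t2001-01-01T12:05:00\n"
    "sub-01\tses-MRI\t2001-03-01T13:14:00\n"
)
# Hypothetical phenotype table without its own acq_time column.
PHENOTYPE_TSV = (
    "participant_id\tsession_id\tmeasurement_1\n"
    "sub-01\tses-pheno\tvalue1\n"
)


def read_tsv(text):
    """Parse TSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def attach_acq_time(phenotype_rows, session_rows):
    """Add acq_time to each phenotype row, or None if the session is unknown."""
    acq_by_key = {
        (row["participant_id"], row["session_id"]): row["acq_time"]
        for row in session_rows
    }
    return [
        {**row, "acq_time": acq_by_key.get((row["participant_id"], row["session_id"]))}
        for row in phenotype_rows
    ]


merged = attach_acq_time(read_tsv(PHENOTYPE_TSV), read_tsv(SESSIONS_TSV))
print(merged[0]["acq_time"])
```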
---
 src/schema/rules/tabular_data/modality_agnostic.yaml | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index ce8cb8f852..c223e6e1b0 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -48,11 +48,6 @@ Phenotypes:
         A chronological `run` number is used when
         a measurement tool or assessment described by a tabular file
         was repeated within a session.
-    acq_time__phenotype:
-      level: optional
-      description_addendum: |
-        If acquisition time is available, the `acq_time` column CAN be used
-        to record the time of acquisition of each row in the tabular file.
   index_columns: [participant_id, session_id, run_id]
   additional_columns: allowed
 

From 2fd12d7f05609ecdc773eaf94be2d08b5a38decf Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Tue, 30 Sep 2025 06:15:19 -0700
Subject: [PATCH 39/53] Update src/schema/objects/columns.yaml

Remove acq_time__phenotype from columns.yaml since it was removed from the rest of the schema.
---
 src/schema/objects/columns.yaml | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/src/schema/objects/columns.yaml b/src/schema/objects/columns.yaml
index 63824fe9f6..bbfb2882a0 100644
--- a/src/schema/objects/columns.yaml
+++ b/src/schema/objects/columns.yaml
@@ -12,15 +12,6 @@ abbreviation:
   description: |
     The unique label abbreviation
   type: string
-acq_time__phenotype:
-  name: acq_time
-  display_name: Phenotypic and assessment data acquisition time
-  description: |
-    Acquisition time refers to when the first data point in each run was acquired.
-    Datetime format and their deidentification are described in
-    [Units](SPEC_ROOT/common-principles.md#units).
-  type: string
-  format: datetime
 acq_time__scans:
   name: acq_time
   display_name: Scan acquisition time

From a0cab8ba9b000a8f8d8c31c7f30a0be95dcc1ba1 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Tue, 30 Sep 2025 06:21:29 -0700
Subject: [PATCH 40/53] Update src/appendices/phenotype.md

Accept Sebastian's suggestion about the phrasing of guideline 8.
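
As an illustration of guideline 8, a dataset could be checked for the
"both levels at once" situation roughly as follows. This is a sketch in
Python, not part of the validator; `conflicting_sessions_files` is a
hypothetical helper, and the file name patterns follow the sessions file
examples in the specification text.

```python
import tempfile
from pathlib import Path


def conflicting_sessions_files(dataset_root: Path) -> list[Path]:
    """List participant-level sessions files that coexist with a root-level one."""
    if not (dataset_root / "sessions.tsv").is_file():
        return []  # no root-level file, so participant-level files are fine
    return sorted(dataset_root.glob("sub-*/sub-*_sessions.tsv"))


# Build a throwaway dataset that violates the guideline.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sessions.tsv").write_text("participant_id\tsession_id\n")
    (root / "sub-01").mkdir()
    (root / "sub-01" / "sub-01_sessions.tsv").write_text("session_id\n")
    conflicts = [path.name for path in conflicting_sessions_files(root)]

print(conflicts)
```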

Co-authored-by: Sebastian Urchs <surchs@users.noreply.github.com>
---
 src/appendices/phenotype.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index db65419282..e9e3feff33 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -100,8 +100,9 @@ and indoor or outdoor experimental conditions.
 
 ### 8. Use either root-level sessions file or participant-level sessions files, but not both
 
-When a sessions file is in use, you MUST NOT provide additional sessions files
-at the participant-level which would otherwise use the inheritance principle.
+When you use a sessions file at the dataset-level, 
+you MUST NOT provide additional sessions files at the participant-level 
+as this might conflict with the inheritance principle.
 
 ### 9. Record acquisition time of all sessions with `acq_time`
 

From c0bd78af6a4ee41bbb7bacb984acfc35033aa6c9 Mon Sep 17 00:00:00 2001
From: "pre-commit-ci[bot]"
 <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Date: Tue, 30 Sep 2025 13:21:52 +0000
Subject: [PATCH 41/53] [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci
---
 src/appendices/phenotype.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index e9e3feff33..d3b58a1912 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -100,8 +100,8 @@ and indoor or outdoor experimental conditions.
 
 ### 8. Use either root-level sessions file or participant-level sessions files, but not both
 
-When you use a sessions file at the dataset-level, 
-you MUST NOT provide additional sessions files at the participant-level 
+When you use a sessions file at the dataset-level,
+you MUST NOT provide additional sessions files at the participant-level
 as this might conflict with the inheritance principle.
 
 ### 9. Record acquisition time of all sessions with `acq_time`

From 97917f0fd476e4269d5406e1a250e6d10a132c53 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Tue, 30 Sep 2025 06:24:35 -0700
Subject: [PATCH 42/53] Update
 src/modality-agnostic-files/data-summary-files.md

Changing "subject-level" to "participant-level" in sessions files section.
---
 src/modality-agnostic-files/data-summary-files.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/modality-agnostic-files/data-summary-files.md b/src/modality-agnostic-files/data-summary-files.md
index 0892aac038..96c69b910b 100644
--- a/src/modality-agnostic-files/data-summary-files.md
+++ b/src/modality-agnostic-files/data-summary-files.md
@@ -283,7 +283,7 @@ sub-03	ses-postdrug	2009-06-30T14:06:40	115
 sub-03	ses-followup	2009-07-01T14:06:40	120
 ```
 
-### OPTIONAL: Subject-level sessions files
+### OPTIONAL: Participant-level sessions files
 
 <!-- This block generates a file tree.
 A guide for using macros can be found at

From 76932fee9adfdbabf7580c00c887cd1cf0de0728 Mon Sep 17 00:00:00 2001
From: Sebastian Urchs <sebastian.urchs@gmail.com>
Date: Tue, 30 Sep 2025 17:17:54 -0400
Subject: [PATCH 43/53] Move longitudinal age section to point 1

---
 src/appendices/phenotype.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index d3b58a1912..8d24c7db38 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -20,6 +20,8 @@ Aggregate participant information across all sessions into one tabular TSV file
 measurement or phenotypic assessment and store this file in the `/phenotype` directory.
 Demographic information is a special case and  MUST be aggregated
 in the `participants.tsv` file at the root level of the dataset.
+It is RECOMMENDED to use the `age` column in the `participants.tsv` file
+to record participant age at every session in longitudinal or multi-session data sets.
 
 ### 2. Always pair tabular data with data dictionaries
 

From 32f994e29675337b8f4b77d55e3fd9c7a89665db Mon Sep 17 00:00:00 2001
From: Sebastian Urchs <sebastian.urchs@gmail.com>
Date: Tue, 30 Sep 2025 17:26:58 -0400
Subject: [PATCH 44/53] Revise section 5

To better differentiate demographic
data from phenotypic data
---
 src/appendices/phenotype.md | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index 8d24c7db38..e67a169060 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -68,15 +68,13 @@ and a guide for using macros can be found at
 
 This rule can be considered as "**if anyone uses sessions, everyone uses sessions**."
 
-### 5. Store longitudinal age in the participants file
-
-It is RECOMMENDED to use the `age` column to record participant age
-at every session in longitudinal or multi-session data sets.
-This reduces data duplication across tabular data files. The `Units` of `age`
-do not have to be years so long as the units of the age
-are written in `participants.json`.
-Consider participant privacy or study objectives when selecting
-the `Units` of `age` or the accuracy of `age` data.
+### 5. Store demographic data in `participants.tsv` and instrument data in the `/phenotype` directory
+
+The `participants.tsv` file is for demographic information about the participant,
+including longitudinal information such as `age`.
+The `/phenotype` directory is for phenotypic information collected about
+the participants, such as questionnaires, cognitive assessments or tasks.
+Create one tabular `.tsv` file for each instrument or assessment in the `/phenotype` directory.
 
 ### 6. Use the sessions file at the root-level
 

From 3a602ca98ffb9a15639a6d2a87319156331e2090 Mon Sep 17 00:00:00 2001
From: "Christopher J. Markiewicz" <markiewicz@stanford.edu>
Date: Wed, 1 Oct 2025 13:44:13 -0400
Subject: [PATCH 45/53] fix: Participant ID mismatch check

---
 src/schema/rules/checks/dataset.yaml | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/schema/rules/checks/dataset.yaml b/src/schema/rules/checks/dataset.yaml
index 5fbf91c4a6..32859105c2 100644
--- a/src/schema/rules/checks/dataset.yaml
+++ b/src/schema/rules/checks/dataset.yaml
@@ -38,10 +38,8 @@ ParticipantIDMismatch:
     - path == '/participants.tsv'
   checks:
     - |
-      allequal(
-        sorted(intersects(columns.participant_id, dataset.subjects.sub_dirs)),
-        sorted(dataset.subjects.sub_dirs)
-      )
+      length(intersects(unique(columns.participant_id), dataset.subjects.sub_dirs))) ==
+      length(dataset.subjects.sub_dirs)
 
 # 214
 SamplesTSVMissing:

From f8d492ea4b5363111561265c12befd557073385a Mon Sep 17 00:00:00 2001
From: "Christopher J. Markiewicz" <markiewicz@stanford.edu>
Date: Wed, 1 Oct 2025 13:52:04 -0400
Subject: [PATCH 46/53] fix(schema): Resolve a couple issues

---
 src/schema/rules/checks/dataset.yaml                 | 2 +-
 src/schema/rules/tabular_data/modality_agnostic.yaml | 4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/schema/rules/checks/dataset.yaml b/src/schema/rules/checks/dataset.yaml
index 32859105c2..d2aca48eff 100644
--- a/src/schema/rules/checks/dataset.yaml
+++ b/src/schema/rules/checks/dataset.yaml
@@ -38,7 +38,7 @@ ParticipantIDMismatch:
     - path == '/participants.tsv'
   checks:
     - |
-      length(intersects(unique(columns.participant_id), dataset.subjects.sub_dirs))) ==
+      length(intersects(unique(columns.participant_id), dataset.subjects.sub_dirs)) ==
       length(dataset.subjects.sub_dirs)
 
 # 214
diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index c223e6e1b0..806a924400 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -103,9 +103,11 @@ Phenotype:
   initial_columns:
     - participant_id
     - session_id
+    - run_id
   columns:
     participant_id: required
     session_id: optional
+    run_id: optional
     HED: optional
-  index_columns: [participant_id, session_id]
+  index_columns: [participant_id, session_id, run_id]
   additional_columns: allowed

From 7f1eb09233bed384509e6ee7e646729a4e8f6778 Mon Sep 17 00:00:00 2001
From: "Christopher J. Markiewicz" <markiewicz@stanford.edu>
Date: Wed, 1 Oct 2025 14:43:16 -0400
Subject: [PATCH 47/53] feat: Expanded validation checks

---
 .../rules/tabular_data/modality_agnostic.yaml | 20 +++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index 806a924400..0ae919c640 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -83,6 +83,7 @@ Sessions:
   selectors:
     - suffix == "sessions"
     - extension == ".tsv"
+    - '!intersects(dataset.dataset_description.AdditionalValidation, ["Phenotype"])'
   initial_columns:
     - participant_id
     - session_id
@@ -97,9 +98,21 @@ Sessions:
   index_columns: [participant_id, session_id, run_id]
   additional_columns: allowed
 
+Sessions__Additional:
+  $ref: rules.tabular_data.modality_agnostic.Sessions
+  selectors:
+    - suffix == "sessions"
+    - extension == ".tsv"
+    - intersects(dataset.dataset_description.AdditionalValidation, ["Phenotype"])
+  columns:
+    $ref: rules.tabular_data.modality_agnostic.Sessions.columns
+    acq_time__sessions: required
+  additional_columns: allowed_if_defined
+
 Phenotype:
   selectors:
     - datatype == 'phenotype'
+    - '!intersects(dataset.dataset_description.AdditionalValidation, ["Phenotype"])'
   initial_columns:
     - participant_id
     - session_id
@@ -111,3 +124,10 @@ Phenotype:
     HED: optional
   index_columns: [participant_id, session_id, run_id]
   additional_columns: allowed
+
+Phenotype__Additional:
+  $ref: rules.tabular_data.modality_agnostic.Phenotype
+  selectors:
+    - datatype == 'phenotype'
+    - intersects(dataset.dataset_description.AdditionalValidation, ["Phenotype"])
+  additional_columns: allowed_if_defined

From 5c55eb97b078fc2d4aa4a53ff0cfaf035cb486fa Mon Sep 17 00:00:00 2001
From: "Christopher J. Markiewicz" <markiewicz@stanford.edu>
Date: Wed, 1 Oct 2025 14:51:28 -0400
Subject: [PATCH 48/53] deduplicate

---
 .../rules/tabular_data/modality_agnostic.yaml | 34 ++++++-------------
 1 file changed, 10 insertions(+), 24 deletions(-)

diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index 0ae919c640..bdadf85410 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -20,8 +20,9 @@ Participants:
 
 Phenotypes:
   selectors:
-    - datatype == "phenotype"
+    - datatype == 'phenotype'
     - extension == ".tsv"
+    - '!intersects(dataset.dataset_description.AdditionalValidation, ["Phenotype"])'
   initial_columns:
     - participant_id
     - session_id
@@ -48,9 +49,17 @@ Phenotypes:
         A chronological `run` number is used when
         a measurement tool or assessment described by a tabular file
         was repeated within a session.
+    HED: optional
   index_columns: [participant_id, session_id, run_id]
   additional_columns: allowed
 
+Phenotypes__Additional:
+  $ref: rules.tabular_data.modality_agnostic.Phenotypes
+  selectors:
+    - datatype == 'phenotype'
+    - intersects(dataset.dataset_description.AdditionalValidation, ["Phenotype"])
+  additional_columns: allowed_if_defined
+
 Samples:
   selectors:
     - path == "/samples.tsv"
@@ -108,26 +117,3 @@ Sessions__Additional:
     $ref: rules.tabular_data.modality_agnostic.Sessions.columns
     acq_time__sessions: required
   additional_columns: allowed_if_defined
-
-Phenotype:
-  selectors:
-    - datatype == 'phenotype'
-    - '!intersects(dataset.dataset_description.AdditionalValidation, ["Phenotype"])'
-  initial_columns:
-    - participant_id
-    - session_id
-    - run_id
-  columns:
-    participant_id: required
-    session_id: optional
-    run_id: optional
-    HED: optional
-  index_columns: [participant_id, session_id, run_id]
-  additional_columns: allowed
-
-Phenotype__Additional:
-  $ref: rules.tabular_data.modality_agnostic.Phenotype
-  selectors:
-    - datatype == 'phenotype'
-    - intersects(dataset.dataset_description.AdditionalValidation, ["Phenotype"])
-  additional_columns: allowed_if_defined

From c0951f330a680ee81048b020556d9d57dc4fbe8b Mon Sep 17 00:00:00 2001
From: "Christopher J. Markiewicz" <markiewicz@stanford.edu>
Date: Wed, 1 Oct 2025 14:53:40 -0400
Subject: [PATCH 49/53] Promote Participants

---
 src/schema/rules/tabular_data/modality_agnostic.yaml | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/src/schema/rules/tabular_data/modality_agnostic.yaml b/src/schema/rules/tabular_data/modality_agnostic.yaml
index bdadf85410..fd6b919b1c 100644
--- a/src/schema/rules/tabular_data/modality_agnostic.yaml
+++ b/src/schema/rules/tabular_data/modality_agnostic.yaml
@@ -2,6 +2,7 @@
 Participants:
   selectors:
     - path == "/participants.tsv"
+    - '!intersects(dataset.dataset_description.AdditionalValidation, ["Phenotype"])'
   initial_columns:
     - participant_id
     - session_id
@@ -18,6 +19,17 @@ Participants:
   index_columns: [participant_id, session_id]
   additional_columns: allowed
 
+Participants__Additional:
+  $ref: rules.tabular_data.modality_agnostic.Participants
+  selectors:
+    - path == "/participants.tsv"
+    - intersects(dataset.dataset_description.AdditionalValidation, ["Phenotype"])
+  columns:
+    participant_id: required
+    session_id: recommended
+    HED: optional
+  additional_columns: allowed_if_defined
+
 Phenotypes:
   selectors:
     - datatype == 'phenotype'

From 69183c78ab6986dad601a5d0f54dedf1b720db77 Mon Sep 17 00:00:00 2001
From: "Christopher J. Markiewicz" <markiewicz@stanford.edu>
Date: Thu, 2 Oct 2025 13:03:04 -0400
Subject: [PATCH 50/53] feat: Make MeasurementToolMetadata recommended for
 phenotype files

---
 src/schema/rules/sidecars/phenotype.yaml | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)
 create mode 100644 src/schema/rules/sidecars/phenotype.yaml

diff --git a/src/schema/rules/sidecars/phenotype.yaml b/src/schema/rules/sidecars/phenotype.yaml
new file mode 100644
index 0000000000..8b97b40c23
--- /dev/null
+++ b/src/schema/rules/sidecars/phenotype.yaml
@@ -0,0 +1,18 @@
+---
+MeasurementToolMetadata:
+  selectors:
+    - datatype == 'phenotype'
+    - extension == '.tsv'
+    - '!intersects(dataset.dataset_description.AdditionalValidation, ["Phenotype"])'
+  fields:
+    MeasurementToolMetadata:
+      level: optional
+      level_addendum: recommended by phenotype guidelines
+
+MeasurementToolMetadataRec:
+  selectors:
+    - datatype == 'phenotype'
+    - extension == '.tsv'
+    - intersects(dataset.dataset_description.AdditionalValidation, ["Phenotype"])
+  fields:
+    MeasurementToolMetadata: recommended

From d3f1d0d1ac419b479a165b1d16d327d801572487 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Sat, 11 Oct 2025 06:38:19 -0700
Subject: [PATCH 51/53] Updates for community review of BEP036

Made changes to align with final feedback prior to community review.
---
 src/appendices/phenotype.md                   | 60 +++++++++++--------
 .../data-summary-files.md                     | 48 +++++++--------
 .../dataset-description.md                    |  2 +-
 .../phenotypic-and-assessment-data.md         | 56 +++++------------
 src/schema/objects/metadata.yaml              |  2 +-
 5 files changed, 73 insertions(+), 95 deletions(-)
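
Note (not part of the commit): the "if anyone uses sessions, everyone uses sessions" guideline moved into the appendix by this patch can be read as an all-or-none condition. The sketch below shows one way to check it, assuming the caller has already collected the column names of each phenotype TSV file and knows whether the imaging data uses session directories. The function name and its inputs are hypothetical, not an existing validator API.

```python
def sessions_consistent(phenotype_headers, imaging_has_sessions):
    """Check the all-or-none session guideline.

    phenotype_headers: mapping of phenotype file name -> list of column names.
    imaging_has_sessions: True if the imaging data uses session directories.
    """
    uses_sessions = ["session_id" in cols for cols in phenotype_headers.values()]
    uses_sessions.append(imaging_has_sessions)
    # Consistent when every part uses sessions, or when no part does.
    return all(uses_sessions) or not any(uses_sessions)


headers = {
    "acds.tsv": ["participant_id", "session_id", "score"],
    "survey.tsv": ["participant_id", "score"],
}
print(sessions_consistent(headers, imaging_has_sessions=True))
```

In this example `survey.tsv` lacks a `session_id` column while the other parts use sessions, so the check reports an inconsistency.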

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index e67a169060..6f2b23f857 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -12,7 +12,7 @@ and phenotypic and assessment data.
 They are recommendations and are by default ignored during validation.
 You can make them mandatory during validation by setting the
 [`AdditionalValidation` key](../modality-agnostic-files/dataset-description.md#additional-validation)
-to `"Phenotype"` in the `dataset_description.json`.
+to contain `"Phenotype"` in the `dataset_description.json`.
 
 ### 1. Aggregate data across sessions
 
@@ -39,7 +39,7 @@ See [`MeasurementToolMetadata` in the glossary](../glossary.md#measurementtoolme
 
 ### 4. Ensure minimal annotation for phenotypic and assessment data
 
-In phenotypic and assessment data each measurement tool SHOULD have an independent
+In phenotypic and assessment data, each measurement tool SHOULD have an independent
 aggregated data TSV file in which the user collects all subjects, sessions,
 and/or runs of data as one entry per row (with a row defined by
 the smallest unit of acquisition). In other words:
@@ -54,9 +54,9 @@ the smallest unit of acquisition). In other words:
 -   If more than one of the same measurement tool is acquired within
     the same `session_id`, a `run_id` column MUST be added.
 
--   Encoding  the acquisition time for a measurement tool’s `session_id`,
-     is RECOMMENDED. This information MUST be stored in the `sessions.tsv`
-     file at the root level of the dataset in the `acq_time` column.
+-   Encoding the acquisition time for a measurement tool’s `session_id`,
+    is RECOMMENDED. This information MUST be stored in the `sessions.tsv`
+    file at the root level of the dataset in the `acq_time` column.
 
 <!-- This block generates a columns table.
 The definitions of these fields can be found in
@@ -66,27 +66,29 @@ and a guide for using macros can be found at
 -->
 {{ MACROS___make_columns_table("modality_agnostic.Phenotypes") }}
 
-This rule can be considered as "**if anyone uses sessions, everyone uses sessions**."
+Furthermore, if you have to add a `session_id` column to the tabular phenotypic data,
+you then MUST also introduce a session directory to the imaging data,
+even if only one imaging session has been created.
+This rule can be considered as "**if anyone uses sessions, everyone uses sessions.**"
+And vice versa, if imaging data has session directories,
+all imaging data and tabular phenotypic data MUST have sessions.
 
-### 5. Store demographic data in `participants.tsv` and instrument data in the `/phenotype` directory
+This produces a file in which same-participant entries can take up as many rows as needed
+according to the smallest unit of acquisition.
+The combination of values in the `participant_id`, `session_id`, and `run_id` (if present)
+columns MUST be unique for the entire tabular file.
 
-The `participants.tsv` file is for demographic information about the participant,
-including longitudinal information such as `age`.
-The `/phenotype` directory is for phenotypic information collected about
-the participants, such as questionnaires, cognitive assessments or tasks.
-Create one tabular `.tsv` file for each instrument or assessment in the `/phenotype` directory.
-
-### 6. Use the sessions file at the root-level
+### 5. Store demographic data in the participants file and instrument data in the phenotype directory
 
-If there is more than one session for any one participant, then
-it is RECOMMENDED to provide a sessions file at the dataset root.
-The sessions file MUST list all sessions for all subjects across
-imaging and tabular phenotypic data.
-If a sessions file is provided, then it MUST begin with a `participant_id` column
-followed immediately by a `session_id` column. The data dictionary JSON file’s
-`session_id` field MUST include `Levels` with the description of each `session_id`.
+The participants file is for demographic data about the participant,
+including longitudinal information such as `age`.
+The phenotypic and assessment data directory
+is for data from phenotypic measurement instruments administered to the participants,
+such as questionnaires, surveys, and cognitive assessments.
+Create one tabular file for each instrument
+in the phenotypic and assessment data directory.
 
-### 7. Record participant properties in the participants file and session properties in the sessions file
+### 6. Record participant properties in the participants file and session properties in the sessions file
 
 Since the same `participant_id` and `session_id` columns can be used
 similarly in the participants file and the sessions file,
@@ -98,6 +100,14 @@ Properties of sessions MAY include things like
 acquisition time, measurement device properties,
 and indoor or outdoor experimental conditions.
 
+### 7. Use the sessions file at the root-level
+
+If there is more than one session for any one participant, then
+it is RECOMMENDED to provide a sessions file at the dataset root.
+The sessions file MUST list all sessions for all subjects across
+imaging and tabular phenotypic data. The data dictionary JSON file’s
+`session_id` field MUST include `Levels` with the description of each `session_id`.
+
 ### 8. Use either root-level sessions file or participant-level sessions files, but not both
 
 When you use a sessions file at the dataset-level,
@@ -107,7 +117,8 @@ as this might conflict with the inheritance principle.
 ### 9. Record acquisition time of all sessions with `acq_time`
 
 It is RECOMMENDED to store acquisition time[<sup>2</sup>](#footnotes)
-for tabular phenotypic data in the sessions file in a column named `acq_time`.
+for tabular phenotypic data and store the time of acquisition of each row
+inside a column named `acq_time` in the sessions file.
 This is consistent with how acquisition time is recorded for MRI data
 and other time-sensitive measurements (for example systolic blood pressure).
 
@@ -123,7 +134,8 @@ using the `acq_time` column.
 
 This appendix described guidelines for best tabular phenotypic data.
 In summary, it is RECOMMENDED to always use the participants file
-and separate files by assessment in the `/phenotype/` directory,
+and separate files by measurement instrument in
+the phenotypic and assessment data directory,
 since they each collect different information.
 If you use sessions, then the sessions file is also RECOMMENDED.
 
diff --git a/src/modality-agnostic-files/data-summary-files.md b/src/modality-agnostic-files/data-summary-files.md
index 96c69b910b..1b1976453e 100644
--- a/src/modality-agnostic-files/data-summary-files.md
+++ b/src/modality-agnostic-files/data-summary-files.md
@@ -213,7 +213,7 @@ meg/sub-control01_task-rest_split-02_meg.nii.gz	1877-06-15T12:15:27
 In case of multiple sessions there is an option of adding an additional
 `sessions.tsv` file describing each session and variables changing between sessions.
 It is RECOMMENDED to provide this as a single file at the root-level of the dataset.
-It is OPTIONAL to provide these as separate files at the subject-level of the dataset.
+It is OPTIONAL to instead provide these as separate files at the subject-level of the dataset.
 The intent of the sessions file is to describe the sessions
 in a data set and non-demographic variables changing between sessions.
 Column names in `sessions.tsv` files MUST be different from participant key column names in
@@ -315,31 +315,25 @@ ses-followup	2009-06-17T13:45:30	110
 
 When the [`AdditionalValidation` key](dataset-description.md#additional-validation)
 contains `"Phenotype"` in the `dataset_description.json`,
-the following expectations apply to sessions files.
-
-1.  If there is more than one session for any one participant, then it is
-    REQUIRED to provide a sessions file at the dataset root.
-    The sessions file MUST list all sessions for all subjects
-    across imaging and tabular phenotypic data. If a sessions file is provided, then
-    it MUST begin with a `participant_id` column followed immediately by
-    a `session_id` column. The data dictionary JSON file's `session_id` field
-    MUST include `Levels` with the description of each `session_id`.
-
-1.  When a root-level sessions file is in use, you MUST NOT provide
-    additional sessions files at the participant-level
-    which would otherwise obey the inheritance principle.
-
-1.  Whenever possible, it is RECOMMENDED to also collect acquisition time
-    for tabular phenotypic data and store the time of acquisition of each row
-    inside a column named `acq_time` in the sessions file.
-    This is consistent with how acquisition time is recorded for MRI data
-    and other time-sensitive measurements (for example systolic blood pressure).
-
-1.  When it is needed to preserve participant privacy, you SHOULD record
-    relative acquisition times with respect to the earliest session.
-    Relative session acquisition times MAY be listed as durations from
-    the earliest session (baseline) in days, months, or years
-    using the `acq_time` column.
+the following tabular phenotypic data guidelines
+apply to sessions files:
+
+-   [6.](../appendices/phenotype.md#6-record-participant-properties-in-the-participants-file-and-session-properties-in-the-sessions-file)
+    Record participant properties in the participants file
+    and session properties in the sessions file
+
+-   [7.](../appendices/phenotype.md#7-use-the-sessions-file-at-the-root-level)
+    Use the sessions file at the root-level
+
+-   [8.](../appendices/phenotype.md#8-use-either-root-level-sessions-file-or-participant-level-sessions-files-but-not-both)
+    Use either root-level sessions file or
+    participant-level sessions files, but not both
+
+-   [9.](../appendices/phenotype.md#9-record-acquisition-time-of-all-sessions-with-acq_time)
+    Record acquisition time of all sessions with `acq_time`
+
+-   [10.](../appendices/phenotype.md#10-respect-participant-privacy-when-recording-acquisition-times)
+    Respect participant privacy when recording acquisition times
 
 To read more about the guidelines for tabular phenotypic data and examples,
-see the [Tabular phenotypic data guidelines appendix](../appendices/phenotype.md).
+see the [tabular phenotypic data guidelines appendix](../appendices/phenotype.md).
diff --git a/src/modality-agnostic-files/dataset-description.md b/src/modality-agnostic-files/dataset-description.md
index c9e73b11cd..bd6a42b95d 100644
--- a/src/modality-agnostic-files/dataset-description.md
+++ b/src/modality-agnostic-files/dataset-description.md
@@ -169,7 +169,7 @@ Example:
 
 The `AdditionalValidation` key MAY be used to opt into additional validation
 to be performed on the dataset beyond standard BIDS validation.
-The value of this field is either a string or an array of strings,
+The value of this field is an array of strings,
 each of which MUST be the name of a supported additional validation to be performed.
 
 The currently supported values are:
diff --git a/src/modality-agnostic-files/phenotypic-and-assessment-data.md b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
index fb93c5561a..7c75a785c9 100644
--- a/src/modality-agnostic-files/phenotypic-and-assessment-data.md
+++ b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
@@ -97,52 +97,24 @@ questionnaire).
 
 When the [`AdditionalValidation` key](dataset-description.md#additional-validation)
 contains `"Phenotype"` in the `dataset_description.json`,
-the following expectations apply to phenotypic and assessment data.
+the following tabular phenotypic data guidelines
+apply to phenotypic and assessment data:
 
-1.  It is REQUIRED to aggregate all participant data into
-    one TSV per tabular phenotypic file.
+-   [1.](../appendices/phenotype.md#1-aggregate-data-across-sessions)
+    Aggregate data across sessions
 
-1.  Each tabular phenotypic data TSV file MUST be accompanied by
-    a corresponding data dictionary JSON file.
+-   [2.](../appendices/phenotype.md#2-always-pair-tabular-data-with-data-dictionaries)
+    Always pair tabular data with data dictionaries
 
-1.  Whenever possible, it is RECOMMENDED to add `MeasurementToolMetadata` to
-    each `phenotype/<measurement_tool_name>.json` data dictionary.
-    This improves reusability and provides clarity about the measurement tool.
+-   [3.](../appendices/phenotype.md#3-add-measurementtoolmetadata-to-each-tabular-phenotypic-measurement-tool)
+    Add `MeasurementToolMetadata` to each tabular phenotypic measurement tool
 
-1.  Each measurement tool SHOULD have an independent
-    aggregated data TSV file in which the user collects all subjects, sessions,
-    and/or runs of data as one entry per row (with a row defined by
-    the smallest unit of acquisition). In other words:
+-   [4.](../appendices/phenotype.md#4-ensure-minimal-annotation-for-phenotypic-and-assessment-data)
+    Ensure minimal annotation for phenotypic and assessment data
 
-    1.  Each row MUST start with `participant_id`.
-
-    1.  Each TSV file MUST contain a `session_id` column when
-        multiple [sessions](../glossary.md#session-entities) are present
-        in the data set regardless of whether those sessions are in
-        the `phenotype/` data, `sub-<label>/` data, or a combination of the two.
-        See the first two examples in [the appendix](../appendices/phenotype.md).
-
-    1.  If more than one of the same measurement tool is acquired within
-        the same `session_id`, a `run_id` column MUST be added.
-
-    1.  To encode the acquisition time for a tabular phenotypic file’s `session_id`,
-        add the `session_id` to the sessions file and
-        include the OPTIONAL `acq_time` column.
-
-    To see this guideline summarized as a table,
-    see [the appendix](../appendices/phenotype.md#to-summarize-this-guideline-as-a-table).
-
-    Furthermore, if you have to add a `session_id` column to the tabular phenotypic data,
-    you then MUST also introduce a session directory to the imaging data,
-    even if only one imaging session has been created.
-    This rule can be considered as "**if anyone uses sessions, everyone uses sessions.**"
-    And vice versa, if imaging data has session directories,
-    all imaging data and tabular phenotypic data MUST have sessions.
-
-    This produces a file in which same-participant entries can take up as many rows as needed
-    according to the smallest unit of acquisition.
-    The combination of values in the `participant_id`, `session_id`, and `run_id` (if present)
-    columns MUST be unique for the entire tabular file.
+-   [5.](../appendices/phenotype.md#5-store-demographic-data-in-the-participants-file-and-instrument-data-in-the-phenotype-directory)
+    Store demographic data in the participants file
+    and instrument data in the phenotype directory
 
 To read more about the guidelines for tabular phenotypic data and examples,
-see the [Tabular phenotypic data guidelines appendix](../appendices/phenotype.md).
+see the [tabular phenotypic data guidelines appendix](../appendices/phenotype.md).
diff --git a/src/schema/objects/metadata.yaml b/src/schema/objects/metadata.yaml
index bdbc05c31c..295cd5d4a9 100644
--- a/src/schema/objects/metadata.yaml
+++ b/src/schema/objects/metadata.yaml
@@ -2265,7 +2265,7 @@ MeasurementToolMetadata:
     Contains two fields: `"Description"` and `"TermURL"`.
     `"Description"` is a free text description of the measurement tool.
     `"TermURL"` is a URL to an entity in an ontology corresponding to this tool.
-    RECOMMENDED by `AdditionalValidation` of `"Phenotype"` in `dataset_description.json`.
+    RECOMMENDED by `AdditionalValidation` containing `"Phenotype"` in `dataset_description.json`.
   type: object
   properties:
     TermURL:

From b60eac1e58aa74ea68228ae85ee158ce66b355c0 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Tue, 4 Nov 2025 11:30:24 -0800
Subject: [PATCH 52/53] Update phenotype.md and data-summary-files.md

Doing my best to address Chris' PR comment about a few pieces.
---
 src/appendices/phenotype.md                   | 23 ++++---------------
 .../data-summary-files.md                     |  9 +-------
 2 files changed, 5 insertions(+), 27 deletions(-)

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index 6f2b23f857..3c84de812b 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -54,9 +54,8 @@ the smallest unit of acquisition). In other words:
 -   If more than one of the same measurement tool is acquired within
     the same `session_id`, a `run_id` column MUST be added.
 
--   Encoding the acquisition time for a measurement tool’s `session_id`,
-    is RECOMMENDED. This information MUST be stored in the `sessions.tsv`
-    file at the root level of the dataset in the `acq_time` column.
+-   A measurement tool’s acquisition time SHOULD be stored in the `sessions.tsv`
+    file at the root-level of the dataset in the `acq_time` column.
 
 <!-- This block generates a columns table.
 The definitions of these fields can be found in
@@ -69,7 +68,7 @@ and a guide for using macros can be found at
 Furthermore, if you have to add a `session_id` column to the tabular phenotypic data,
 you then MUST also introduce a session directory to the imaging data,
 even if only one imaging session has been created.
-This rule can be considered as "**if anyone uses sessions, everyone uses sessions.**"
+This guideline can be considered as "**if anyone uses sessions, everyone uses sessions.**"
 And vice versa, if imaging data has session directories,
 all imaging data and tabular phenotypic data MUST have sessions.
 
@@ -108,13 +107,7 @@ The sessions file MUST list all sessions for all subjects across
 imaging and tabular phenotypic data. The data dictionary JSON file’s
 `session_id` field MUST include `Levels` with the description of each `session_id`.
 
-### 8. Use either root-level sessions file or participant-level sessions files, but not both
-
-When you use a sessions file at the dataset-level,
-you MUST NOT provide additional sessions files at the participant-level
-as this might conflict with the inheritance principle.
-
-### 9. Record acquisition time of all sessions with `acq_time`
+### 8. Record acquisition time of all sessions with `acq_time`
 
 It is RECOMMENDED to store acquisition time[<sup>2</sup>](#footnotes)
 for tabular phenotypic data and store the time of acquisition of each row
@@ -122,14 +115,6 @@ inside a column named `acq_time` in the sessions file.
 This is consistent with how acquisition time is recorded for MRI data
 and other time-sensitive measurements (for example systolic blood pressure).
 
-### 10. Respect participant privacy when recording acquisition times
-
-When needed to preserve participant privacy, you SHOULD record
-relative acquisition times with respect to the earliest session.
-Relative session acquisition times MAY be listed as durations from
-the earliest session (baseline) in days, months, or years
-using the `acq_time` column.
-
 ## Summary
 
 This appendix described guidelines for best tabular phenotypic data.
diff --git a/src/modality-agnostic-files/data-summary-files.md b/src/modality-agnostic-files/data-summary-files.md
index 1b1976453e..0389c41a81 100644
--- a/src/modality-agnostic-files/data-summary-files.md
+++ b/src/modality-agnostic-files/data-summary-files.md
@@ -325,15 +325,8 @@ apply to sessions files:
 -   [7.](../appendices/phenotype.md#7-use-the-sessions-file-at-the-root-level)
     Use the sessions file at the root-level
 
--   [8.](../appendices/phenotype.md#8-use-either-root-level-sessions-file-or-participant-level-sessions-files-but-not-both)
-    Use either root-level sessions file or
-    participant-level sessions files, but not both
-
--   [9.](../appendices/phenotype.md#9-record-acquisition-time-of-all-sessions-with-acq_time)
+-   [8.](../appendices/phenotype.md#8-record-acquisition-time-of-all-sessions-with-acq_time)
     Record acquisition time of all sessions with `acq_time`
 
--   [10.](../appendices/phenotype.md#10-respect-participant-privacy-when-recording-acquisition-times)
-    Respect participant privacy when recording acquisition times
-
 To read more about the guidelines for tabular phenotypic data and examples,
 see the [tabular phenotypic data guidelines appendix](../appendices/phenotype.md).

From d8b34f349d5640b4ddf835d42da878c9f5d5e5d9 Mon Sep 17 00:00:00 2001
From: Eric Earl <eric.earl@nih.gov>
Date: Tue, 4 Nov 2025 13:42:15 -0800
Subject: [PATCH 53/53] Update phenotype.md and
 phenotypic-and-assessment-data.md

Trying to address some of @rwblair's comments on the PR.
---
 src/appendices/phenotype.md                               | 8 +++-----
 .../phenotypic-and-assessment-data.md                     | 2 +-
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/src/appendices/phenotype.md b/src/appendices/phenotype.md
index 3c84de812b..1c70bb85f0 100644
--- a/src/appendices/phenotype.md
+++ b/src/appendices/phenotype.md
@@ -18,7 +18,7 @@ contains `"Phenotype"` in the `dataset_description.json`.
 
 Aggregate participant information across all sessions into one tabular TSV file per
 measurement or phenotypic assessment and store this file in the `/phenotype` directory.
-Demographic information is a special case and  MUST be aggregated
+Demographic information is a special case and SHOULD be aggregated
 in the `participants.tsv` file at the root level of the dataset.
 It is RECOMMENDED to use the `age` column in the `participants.tsv` file
 to record participant age at every session in longitudinal or multi-session data sets.
@@ -51,8 +51,8 @@ the smallest unit of acquisition). In other words:
     in the data set regardless of whether those sessions are in
     the `phenotype/` data, `sub-<label>/` data, or a combination of the two.
 
--   If more than one of the same measurement tool is acquired within
-    the same `session_id`, a `run_id` column MUST be added.
+-   If a measurement tool is acquired multiple times within a single session,
+    a `run_id` column MUST be added to disambiguate the separate acquisitions.
 
 -   A measurement tool’s acquisition time SHOULD be stored in the `sessions.tsv`
     file at the root-level of the dataset in the `acq_time` column.
@@ -74,8 +74,6 @@ all imaging data and tabular phenotypic data MUST have sessions.
 
 This produces a file in which same-participant entries can take up as many rows as needed
 according to the smallest unit of acquisition.
-The combination of values in the `participant_id`, `session_id`, and `run_id` (if present)
-columns MUST be unique for the entire tabular file.
 
 ### 5. Store demographic data in the participants file and instrument data in the phenotype directory
 
diff --git a/src/modality-agnostic-files/phenotypic-and-assessment-data.md b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
index 7c75a785c9..1a3958c035 100644
--- a/src/modality-agnostic-files/phenotypic-and-assessment-data.md
+++ b/src/modality-agnostic-files/phenotypic-and-assessment-data.md
@@ -14,7 +14,7 @@ If the dataset includes multiple sets of participant level measurements (for
 example responses from multiple questionnaires) they can be split into
 individual files separate from `participants.tsv`.
 
-Each of the measurement files MUST be kept in a `/phenotype` directory placed
+Each of the measurement tool files MUST be kept in a `/phenotype` directory placed
 at the root of the BIDS dataset and MUST end with the `.tsv` extension.
 Filenames SHOULD be chosen to reflect the contents of the file.
 For example, the "Adult ADHD Clinical Diagnostic Scale" could be saved in a file