Commit adc2f1f

PhilippeCzajaTNG and Philippe Czaja authored
Allow specifying additional GDPR data in yaml files (#27)
* feat: Allow specifying additional GDPR data in yaml files
* docs: Update and improve README

---------

Co-authored-by: Philippe Czaja <philippe.czaja-ext@rio.cloud>
1 parent 9005d56 commit adc2f1f

12 files changed, +514 -20 lines changed

README.md

Lines changed: 139 additions & 14 deletions
@@ -4,7 +4,7 @@
 ![GitHub Actions Workflow Status](https://img.shields.io/github/actions/workflow/status/rio-cloud/gradle-gdpr-documentation-plugin/build-and-deploy.yaml)
 
 Gradle plugin to generate data classification documentation (needed for the GDPR documentation) for your project based
-on annotations on data classes.
+on annotations on data classes and/or from configuration files.
 
 ## Disclaimer
 
@@ -13,18 +13,136 @@ on annotations on data classes.
 > make sure to classify the data according to your own requirements. RIO is not responsible for your documentation.
 
 > [!NOTE]
-> RIO maintains this repository for their internal documentation. If you need different / additional functionality, please fork the project.
+> RIO maintains this repository for their internal documentation. If you need different / additional functionality,
+> please fork the project.
 
 ## Usage
 
-See [example project](./test).
+### Classify your data
+
+There are two ways to classify data:
+
+1. Annotate your data classes with the provided annotations
+2. Provide configuration files
+
+You can also combine both approaches. In case of conflicts, the configuration files will have precedence over the
+annotations.
+
+See [example project](./test) for examples of both approaches.
+
+#### Annotations
+
+You can annotate your data classes with the following annotations (defined
+in [GdprData](./core/src/main/kotlin/cloud/rio/gdprdoc/annotations/GdprDoc.kt)), describing the data flow and purpose:
+
+- `@Incoming` to mark incoming data (e.g. from a REST API)
+- `@Outgoing` to mark outgoing data (e.g. as a response to an API call from another service)
+- `@Persisted` to mark persisted data (e.g. in a database)
+- `@ReadModel` to mark read models (e.g. in a CQRS setup)
+
+You can also use multiple of these annotations on a single class. Note that `@ReadModel` will automatically classify the
+data both as incoming and as persisted.
+
+You can document the PII level of each field in the class with the `@Field` annotation which accepts the following enum
+values as parameter:
+
+- `PII`
+- `PSEUDONYM`
+- `NON_PII`
+
+#### Configuration files
+
+You can also provide one or multiple configuration files in yaml format to classify your data.
+
+Each configuration file consists of a list of classes, containing the fully qualified class name, one or multiple blocks
+describing the data flow and purpose of the data (analogous to the annotations), and a list of fields with their PII
+level.
+
+The following example illustrates the structure of the configuration file:
+
+```yaml
+classes:
+  - className: cloud.rio.example.adapter.restclient.IncomingDTO
+    incoming:
+      whatToDo: Forward via API
+      whereFrom: Some external service
+      links:
+        - cloud.rio.example.adapter.rest.OutgoingDTO
+    fields:
+      - name: id
+        level: PSEUDONYM
+      - name: name
+        level: PII
+      - name: description
+        level: NON_PII
+  - className: cloud.rio.example.adapter.rest.OutgoingDTO
+    outgoing:
+      sharedWith: Exposed via API
+      why: Display in frontend
+      links:
+        - cloud.rio.example.adapter.restclient.IncomingDTO
+    fields:
+      ...
+  - className: cloud.rio.example.adapter.db.PersistedEntity
+    persisted:
+      retention: 6 months
+      responsibleForDeletion: Automatic deletion job
+      links: []
+    fields:
+      ...
+  - className: cloud.rio.example.adapter.readmodel.ReadModel
+    readModel:
+      whatToDo: Persist in DB
+      whereFrom: Some external service
+      retention: 6 months
+      responsibleForDeletion: Automatic deletion job
+      links: []
+    fields:
+      ...
+```
+
+### Apply and configure the plugin
+
+To use the plugin, add the plugin to the plugins block of your `build.gradle.kts` file and add the core dependency to
+the compile time classpath:
+
+```kotlin
+plugins {
+    id("cloud.rio.gdprdoc") version "2.0.1"
+}
+
+dependencies {
+    compileOnly("cloud.rio.gdprdoc:core:2.0.1")
+}
+```
+
+You can configure the documentation generation task to change the output file name and location, and to specify the
+configuration files to use, for example in the following way:
+
+```kotlin
+tasks {
+    generateGdprDocumentation {
+        markdownReport = file("docs/gdpr/gdpr-documentation.md")
+        additionalGdprDataFiles.setFrom(
+            fileTree("src/main/resources") { include("**/gdpr-documentation.yaml") },
+        )
+    }
+}
+```
+
+By default, the output will be written to `build/reports/gdpr-documentation.md`, and no configuration files will be
+used.
+
+### Generate the documentation
 
 Generate the documentation by running:
+
 ```
 ./gradlew generateGdprDocumentation
 ```
-You find the documentation in `build/reports/gdpr-documentation.md`. It currently needs to be manually
-copied to `docs/gdpr-documentation.md`
+
+You find the documentation in `build/reports/gdpr-documentation.md` unless you configured a different location (see
+above).
 
 Make sure to enable PlantUML in your markdown renderer in your IDE to see the Data Flow Diagram.
 Backstage also supports PlantUML, so it should work there without additional setup.
@@ -34,54 +152,61 @@ Backstage also supports PlantUML, so it should work there without additional set
 ### CI/CD pipeline
 
 This plugin uses GitHub actions to build and deploy the plugin to the Gradle Plugin Portal.
-The workflow is defined in `.github/workflows/build-and-deploy.yaml`and triggered on every push
+The workflow is defined in `.github/workflows/build-and-deploy.yaml`and triggered on every push
 to the `main` branch and on every pull request.
 
 ### Dependabot
-This repository uses [dependabot](https://dependabot.com/) to keep dependencies up to date.
+
+This repository uses [dependabot](https://dependabot.com/) to keep dependencies up to date.
 Dependabot is configured in `.github/dependabot.yaml`.
 
 ### Release process
 
-The workflow uses the [release-please-action from Google](https://github.com/googleapis/release-please-action).
+The workflow uses the [release-please-action from Google](https://github.com/googleapis/release-please-action).
 
-> Release Please assumes you are using Conventional Commit messages.
+> _release-please_ assumes you are using Conventional Commit messages.
 >
 > The most important prefixes you should have in mind are:
 >
 > fix: which represents bug fixes, and correlates to a SemVer patch.
 >
 > feat: which represents a new feature, and correlates to a SemVer minor.
 >
-> feat!:, or fix!:, refactor!:, etc., which represent a breaking change (indicated by the !) and will result in a SemVer major.
+> feat!:, or fix!:, refactor!:, etc., which represent a breaking change (indicated by the !) and will result in a SemVer
+> major.
 
-When release-please detects one or more conventional commits, it will create or update a pull request.
+When release-please detects one or more conventional commits, it will create or update a pull request.
 Once the pull request is merged, the workflow will create a new release and deploy the plugin.
 
 ### Manually trigger the release process
 
 All commits not following the conventional commit format will be ignored by the release-please-action.
 Examples include:
+
 - updating documentation
 - merge dependabot updates
 - ...
 
 To trigger the action simply create an empty commit following the conventional commit format, e.g.:
+
 ```
 git commit --allow-empty -m "fix: prepare next release. Update dependencies"
 git push
 ```
 
 ### Secret management
-The required secrets for the GitHub actions are stored in the repository settings under "Secrets and variables" -> "Actions".
-Currently, they are not managed as code.
+
+The required secrets for the GitHub actions are stored in the repository settings under "Secrets and variables" -> "
+Actions".
+Currently, they are not managed as code.
 
 ### Build and test the plugin
+
 ```
 ./gradlew clean build
 ```
 
-### Build and the the example project
+### Build and test the example project
 
 ```
 cd test

plugin/build.gradle.kts

Lines changed: 5 additions & 2 deletions
@@ -17,6 +17,7 @@
 
 plugins {
     kotlin("jvm")
+    kotlin("plugin.serialization") version "1.9.0"
     id("com.gradle.plugin-publish") version "1.3.1"
     id("com.gradleup.shadow") version "8.3.9"
     `java-gradle-plugin`
@@ -28,6 +29,7 @@ repositories {
 
 dependencies {
     compileOnly(gradleApi())
+    implementation("com.charleskorn.kaml:kaml:0.92.0")
     implementation(kotlin("stdlib"))
     implementation("io.github.classgraph:classgraph:4.8.162")
     implementation(project(":core"))
@@ -48,8 +50,9 @@ gradlePlugin {
             id = "cloud.rio.gdprdoc"
             implementationClass = "cloud.rio.gdprdoc.GdprDocumentationPlugin"
             displayName = "RIO GDPR documentation plugin"
-            description = "Gradle plugin to generate data classification documentation (needed for the GDPR documentation) for your project based on annotations on data classes"
-            tags.set(listOf("gdpr", "documentation","rio", "rio.cloud"))
+            description =
+                "Gradle plugin to generate data classification documentation (needed for the GDPR documentation) for your project based on annotations on data classes"
+            tags.set(listOf("gdpr", "documentation", "rio", "rio.cloud"))
         }
     }
 }

plugin/src/main/kotlin/cloud/rio/gdprdoc/GenerateGdprDocumentationTask.kt

Lines changed: 27 additions & 0 deletions
@@ -16,6 +16,8 @@
 
 package cloud.rio.gdprdoc
 
+import cloud.rio.gdprdoc.additionalgdprdata.AdditionalGdprDataLoader
+import cloud.rio.gdprdoc.additionalgdprdata.AdditionalGdprDataMapper
 import cloud.rio.gdprdoc.annotations.GdprData
 import cloud.rio.gdprdoc.report.GdprDataItem
 import cloud.rio.gdprdoc.report.GdprItemId
@@ -31,6 +33,7 @@ import org.gradle.api.provider.Property
 import org.gradle.api.tasks.Classpath
 import org.gradle.api.tasks.Input
 import org.gradle.api.tasks.InputFiles
+import org.gradle.api.tasks.Optional
 import org.gradle.api.tasks.OutputFile
 import org.gradle.api.tasks.TaskAction
 import org.gradle.internal.extensions.stdlib.uncheckedCast
@@ -51,6 +54,10 @@ abstract class GenerateGdprDocumentationTask : DefaultTask() {
     @get:OutputFile
     abstract val markdownReport: RegularFileProperty
 
+    @get:InputFiles
+    @get:Optional
+    abstract val additionalGdprDataFiles: ConfigurableFileCollection
+
     @TaskAction
     fun process() {
 
@@ -90,6 +97,26 @@ abstract class GenerateGdprDocumentationTask : DefaultTask() {
             }
         }
 
+        val additionalGdprDataLoader = AdditionalGdprDataLoader()
+        val additionalGdprDataMapper = AdditionalGdprDataMapper(classPathFiles, logger)
+        additionalGdprDataFiles.files.forEach {
+            logger.lifecycle("Loading additional GDPR data from file: ${it.absolutePath}")
+            val additionalGdprData = try {
+                additionalGdprDataLoader.loadAdditionalGdprDataFromYamlFile(it)
+            } catch (e: Exception) {
+                logger.warn("Cannot read additional GDPR data from file ${it.absolutePath}: ${e.message}")
+                return@forEach
+            }
+            val (additionalGdprDataItems, additionalLinks) = additionalGdprDataMapper.mapToGdprDataItems(
+                additionalGdprData
+            )
+            additionalGdprDataItems.forEach { newItem ->
+                gdprDataItems.removeIf { existingItem -> existingItem.id == newItem.id }
+                gdprDataItems.add(newItem)
+            }
+            linkTargetClassesByItemId.putAll(additionalLinks)
+        }
+
         val gdprItemLinks = linkTargetClassesByItemId.flatMap { (source, targetClasses) ->
             targetClasses.flatMap { createLinks(sourceId = source, targetClassName = it, items = gdprDataItems) }
         }.toSet()
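
The merge loop in the last hunk appears to be what gives configuration files precedence over annotations: an item loaded from YAML replaces any already-collected item with the same id, and `putAll` overwrites the link targets stored under the same key. The following is a minimal, standard-library-only sketch of that behaviour, not the plugin's code. `Item`, `mergeItems`, and the string ids are hypothetical stand-ins for `GdprDataItem` and `GdprItemId`, whose definitions are not shown in this diff.

```kotlin
// Hypothetical stand-in for GdprDataItem; the real class is not shown in this diff.
data class Item(val id: String, val source: String)

// Mirrors the removeIf/add loop from the task: later items win on id clashes.
fun mergeItems(fromAnnotations: List<Item>, fromYaml: List<Item>): List<Item> {
    val merged = fromAnnotations.toMutableList()
    fromYaml.forEach { newItem ->
        merged.removeIf { existing -> existing.id == newItem.id }
        merged.add(newItem)
    }
    return merged
}

fun main() {
    val annotated = listOf(Item("IncomingDTO", "annotation"), Item("OutgoingDTO", "annotation"))
    val configured = listOf(Item("IncomingDTO", "yaml"))
    // The YAML item replaces the annotated one and is appended at the end.
    println(mergeItems(annotated, configured))

    // putAll replaces the whole value for an existing key; it does not merge the lists.
    val links = mutableMapOf("IncomingDTO" to listOf("A"))
    links.putAll(mapOf("IncomingDTO" to listOf("B")))
    println(links)
}
```

One consequence visible in the sketch: because the replaced item is removed and re-added, an overridden entry moves to the end of the list, and links declared via annotation for the same id are dropped rather than combined with the YAML links.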
Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
+/*
+ *  Copyright 2025 TB Digital Services GmbH
+ *
+ *  Licensed under the Apache License, Version 2.0 (the "License");
+ *  you may not use this file except in compliance with the License.
+ *  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package cloud.rio.gdprdoc.additionalgdprdata
+
+import cloud.rio.gdprdoc.annotations.READ_MODEL_DEFAULT_RESPONSIBLE_FOR_DELETION
+import cloud.rio.gdprdoc.annotations.READ_MODEL_DEFAULT_RETENTION
+import cloud.rio.gdprdoc.model.PiiLevel
+import kotlinx.serialization.Serializable
+
+@Serializable
+data class AdditionalGdprData(
+    val classes: List<AdditionalGdprDataItem>,
+)
+
+@Serializable
+data class AdditionalGdprDataItem(
+    val className: String,
+    val outgoing: Outgoing? = null,
+    val incoming: Incoming? = null,
+    val persisted: Persisted? = null,
+    val readModel: ReadModel? = null,
+    val fields: List<Field>,
+)
+
+@Serializable
+data class Outgoing(
+    val sharedWith: String,
+    val why: String,
+    val links: List<String> = emptyList(),
+)
+
+@Serializable
+data class Incoming(
+    val whereFrom: String,
+    val whatToDo: String,
+    val links: List<String> = emptyList(),
+)
+
+@Serializable
+data class Persisted(
+    val retention: String,
+    val responsibleForDeletion: String,
+    val links: List<String> = emptyList(),
+)
+
+@Serializable
+data class ReadModel(
+    val whereFrom: String,
+    val whatToDo: String,
+    val retention: String = READ_MODEL_DEFAULT_RETENTION,
+    val responsibleForDeletion: String = READ_MODEL_DEFAULT_RESPONSIBLE_FOR_DELETION,
+    val links: List<String> = emptyList(),
+)
+
+@Serializable
+data class Field(
+    val name: String,
+    val level: PiiLevel,
+)
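
Because all four flow blocks in `AdditionalGdprDataItem` are nullable and default to `null`, a single YAML entry can carry any combination of them, which matches the README's "one or multiple blocks" wording. The sketch below is illustrative and not part of the commit. It uses a trimmed copy of the model without `@Serializable` (so it runs on the standard library alone), and `Entry` and `declaredFlows` are hypothetical names introduced only to show which blocks an entry sets.

```kotlin
// Trimmed, non-serializable copies of two model classes, for illustration only.
data class Incoming(val whereFrom: String, val whatToDo: String, val links: List<String> = emptyList())
data class Persisted(val retention: String, val responsibleForDeletion: String, val links: List<String> = emptyList())

// Hypothetical reduced version of AdditionalGdprDataItem with two of the four blocks.
data class Entry(
    val className: String,
    val incoming: Incoming? = null,
    val persisted: Persisted? = null,
)

// Hypothetical helper: names of the flow blocks that are present on an entry.
fun declaredFlows(entry: Entry): List<String> = listOfNotNull(
    entry.incoming?.let { "incoming" },
    entry.persisted?.let { "persisted" },
)

fun main() {
    val entry = Entry(
        className = "cloud.rio.example.adapter.db.PersistedEntity",
        incoming = Incoming(whereFrom = "Some external service", whatToDo = "Persist in DB"),
        persisted = Persisted(retention = "6 months", responsibleForDeletion = "Automatic deletion job"),
    )
    println(declaredFlows(entry))        // [incoming, persisted]
    println(declaredFlows(Entry("x.Y"))) // []
}
```

As far as the model is concerned, an entry with no block at all is still valid. Whether the mapper rejects or ignores such an entry is not visible in this diff.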
