| 1 | +## Processing |
| 2 | + |
| 3 | +As seen in the example dataset, the majority of the data files will be hosted directly on our server. To still provide metadata that describes the contents of the data package, we'll create generalized representative entities to describe the files. |
| 4 | + |
| 5 | +### Understanding the contents of a large data submission |
| 6 | + |
| 7 | +First, we'll need to inspect the files and folders submitted by the PI. Ideally, the PI will have organized the files in a comprehensive folder hierarchy with a file naming convention. Asking the PI to provide the file and folder naming convention for all their submitted files is helpful, as it allows us to create one representative entity for each type of file they have. |
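If the PI has not spelled out the naming convention, it can often be inferred from the paths themselves. The following is only a minimal sketch: it replaces the date portion of each path with a placeholder and counts how many files share each resulting pattern. The helper name `generalize_paths` and the example paths are hypothetical; in practice the paths would come from something like `list.files()` run on the submission folder.

```r
# Hypothetical helper: collapse concrete file paths into naming patterns.
generalize_paths <- function(paths) {
  # Replace ISO dates (e.g. 2019-06-01) with a placeholder
  patterns <- gsub("[0-9]{4}-[0-9]{2}-[0-9]{2}", "[YYYY-MM-DD]", paths)
  # Count how many files share each pattern, most common first
  sort(table(patterns), decreasing = TRUE)
}

# Made-up paths for illustration; for a real submission, use e.g.
# paths <- list.files("path/to/submission", recursive = TRUE)
paths <- c("north/lake_a_2019-06-01.csv",
           "north/lake_a_2019-06-02.csv",
           "south/lake_b_2019-06-01.csv")

pattern_counts <- generalize_paths(paths)
print(pattern_counts)
```

Each distinct pattern is a candidate for one representative entity; further substitutions (for region or lake names, say) would follow the same approach.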
| 8 | + |
| 9 | +We'll also need the PI to submit a dataset, as normal, through the ADC. Since they'll have uploaded their files directly to our server, they won't need to upload any files to their metadata submission. |
| 10 | + |
| 11 | +### Creating representative entities |
| 12 | + |
| 13 | +There are two ways to create representative entities: |
| 14 | + |
| 15 | +#### 1. Creating entities through R |
| 16 | + |
| 17 | +To create the representative entities through R, we will use the `EML` package. In this example, we'll create a general `dataTable` entity: |
| 18 | + |
| 19 | +```{r, eval=FALSE} |
| 20 | +dataTable1 <- eml$dataTable(entityName = "[region_name]/[lake_name]_[YYYY-MM-DD].csv", |
| 21 | +                             entityDescription = "These CSV files contain lake measurement information. YYYY-MM-DD represents the date of measurement.") |
| 22 | + |
| 23 | +doc$dataset$dataTable[[1]] <- dataTable1 |
| 24 | +``` |
| 25 | + |
| 26 | +Now we can add an attribute table to this entity to document the variables that these files share. |
| 27 | + |
| 28 | +```{r, eval=FALSE} |
| 29 | +atts <- EML::shiny_attributes() |
| 30 | +doc$dataset$dataTable[[1]]$attributeList <- EML::set_attributes(attributes = atts$attributes, factors = atts$factors) |
| 31 | +``` |
| 32 | + |
| 33 | +Similarly, we can create representative entities for an `otherEntity`, `spatialVector`, or `spatialRaster`. The required sections differ slightly between these entity types, so be sure to check the [EML schema documentation](https://eml.ecoinformatics.org/schema/) and add every required section. Otherwise, the EML will not validate. |
| 34 | + |
| 35 | +For example, you'll need to have a coordinate reference system for a `spatialVector` or `spatialRaster`. |
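As a rough sketch of the same pattern for a different entity type, a representative `otherEntity` might look like the following. This assumes the `eml$otherEntity()` constructor from the `EML` package takes arguments in the same style as `eml$dataTable()` above; the entity name, description, and type are invented for illustration, and `entityType` is included because the schema requires it for an `otherEntity`.

```r
library(EML)

# Hypothetical representative entity for a group of zipped photo archives
otherEntity1 <- eml$otherEntity(
  entityName = "[region_name]/[lake_name]_photos.zip",
  entityDescription = "Zipped field photographs, one archive per lake.",
  entityType = "application/zip"  # required for otherEntity
)

# With an EML document already loaded as `doc` (an assumption here),
# the entity would be attached the same way as the dataTable above:
# doc$dataset$otherEntity[[1]] <- otherEntity1
# EML::eml_validate(doc)
```

Running `EML::eml_validate()` after adding each entity is a quick way to catch a missing required section.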
| 36 | + |
| 37 | +#### 2. Creating entities through the ADC |
| 38 | + |
| 39 | +Another method to create representative entities is to: |
| 40 | + |
| 41 | +1. Download one of each type of file from the dataset. |
| 42 | + |
| 43 | +2. Upload each file into the data package through the ADC web editor. |
| 44 | + |
| 45 | +3. Add file descriptions and attributes as normal. |
| 46 | + |
| 47 | +4. Change the `entityName` of the file to be more general, showing the file and folder naming convention. |
| 48 | + |
| 49 | +5. Before the dataset is published, remove any `physical` sections from the entities, then remove the files from the data package using `arcticdatautils::updateResourceMap()`. This will remove the file PIDs from the resource map, but it will leave the entities in the EML document. Please ask a Data Coordinator if you have not used `updateResourceMap()` before. |
| 50 | + |
| 51 | +### Moving files to web-accessible location |
| 52 | + |
| 53 | +As mentioned in the section "Uploading files to `datateam`", we'll need to move the files to their web-accessible location in `var/data/10.18739/preIssuedDOI`. |
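The move itself is usually done on the server, but the idea can be sketched in base R. The helper name `publish_files` is hypothetical, and temporary directories stand in for the real submission folder and the web-accessible folder, so nothing below touches the actual server paths.

```r
# Hypothetical helper: copy a submission folder's contents into another folder.
publish_files <- function(src_dir, dest_dir) {
  dir.create(dest_dir, recursive = TRUE, showWarnings = FALSE)
  items <- list.files(src_dir, full.names = TRUE)
  # recursive = TRUE keeps the PI's folder hierarchy intact
  copied <- file.copy(items, dest_dir, recursive = TRUE)
  stopifnot(all(copied))  # fail loudly if anything was not copied
  # Return the relative paths now present at the destination
  list.files(dest_dir, recursive = TRUE)
}

# Demo: temporary directories stand in for the real locations
src <- file.path(tempdir(), "submission")
dir.create(file.path(src, "north"), recursive = TRUE, showWarnings = FALSE)
writeLines("a,b", file.path(src, "north", "lake_a_2019-06-01.csv"))

dest <- file.path(tempdir(), "10.18739", "preIssuedDOI")
published <- publish_files(src, dest)
print(published)
```

Copying first and deleting the originals only after checking the destination listing is a cautious alternative to moving the files in one step.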
| 54 | + |
| 55 | +### Adding Markdown link to abstract in EML |
| 56 | + |
| 57 | +We'll need to provide a link to the files hosted on our server so that viewers can download them. The link to the dataset will be: |
| 58 | + |
| 59 | +> http://arcticdata.io/data/10.18739/preIssuedDOI |
| 60 | + |
| 61 | +To add this as a clickable Markdown link in the Abstract section of an EML doc, we'll need to: |
| 62 | + |
| 63 | +1. Open the EML doc in a text editor. |
| 64 | + |
| 65 | +2. Navigate to the `<abstract>` section. |
| 66 | + |
| 67 | +3. Inside the `<abstract>` section, add a `<markdown> ... </markdown>` section. |
| 68 | + |
| 69 | +4. Add in Markdown-formatted text without any indentations. |
| 70 | + |
| 71 | +This will look like: |
| 72 | + |
| 73 | +``` |
| 74 | + <abstract> |
| 75 | + <markdown> |
| 76 | +### Access |
| 77 | +Files can be accessed and downloaded from the directory via: [http://arcticdata.io/data/10.18739/DOI](http://arcticdata.io/data/10.18739/DOI). |
| 78 | + |
| 79 | +### Overview |
| 80 | +This is the original abstract overview that the PI submitted. |
| 81 | + </markdown> |
| 82 | + </abstract> |
| 83 | +``` |
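To avoid typos when composing this text by hand, the Markdown body can be assembled in R and then pasted between the `<markdown>` tags. This is only a sketch: the helper name `build_markdown_abstract` is hypothetical, and the URL prefix is taken from the example above.

```r
# Hypothetical helper: build the Markdown text for the abstract.
build_markdown_abstract <- function(doi_suffix, overview) {
  url <- paste0("http://arcticdata.io/data/10.18739/", doi_suffix)
  # No leading indentation on any line, as the steps above require
  paste0(
    "### Access\n",
    "Files can be accessed and downloaded from the directory via: ",
    "[", url, "](", url, ").\n\n",
    "### Overview\n",
    overview
  )
}

abstract_md <- build_markdown_abstract(
  doi_suffix = "DOI",  # placeholder, as in the example above
  overview = "This is the original abstract overview that the PI submitted."
)
cat(abstract_md)
```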