@dkumanov
Contributor

@dkumanov dkumanov commented Oct 9, 2019

No description provided.

@dkumanov dkumanov force-pushed the feature/add-filesize-validation branch from d88df55 to 76f1594 Compare October 9, 2019 14:08
@ivaylomitrev
Contributor

Although the grouping is nice, let us not modify the IDs of existing tests, as they would then lose their meaning. Let's simply introduce this new test as AST23 and leave the old IDs intact.

I do not have any other comments.

@dkumanov
Contributor Author

Reorder issue fixed and all commands tested.

<![CDATA[
SELECT COUNT(*) documents_with_mismatched_file_sizes
FROM arkivstruktur.dokumentobjekt
WHERE filstoerrelse <> _detected_file_size;

Why does the variable name begin with an underscore? As far as I can see, the same applies to all variables of this type.

Contributor Author

Column names starting with an underscore indicate that the value is generated by the extraction validator and is not present in the extraction files. In this case, _detected_file_size is the size of the actual file referenced on the system, not the value stored in arkivstruktur.xml.
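To illustrate the distinction, here is a minimal sketch of how the query from the diff behaves. It assumes an in-memory SQLite database as a stand-in for the validator's actual database (which is not named in this thread) and a table reduced to the two columns the query touches; the sample rows are invented for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for the validator's real database. The schema
# holds only the two columns used by the query under review.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS arkivstruktur")
conn.execute(
    "CREATE TABLE arkivstruktur.dokumentobjekt ("
    " filstoerrelse INTEGER,"        # size declared in arkivstruktur.xml
    " _detected_file_size INTEGER)"  # size measured by the validator
)

# Hypothetical sample data: the second row declares a size that does not
# match the file found on disk.
conn.executemany(
    "INSERT INTO arkivstruktur.dokumentobjekt VALUES (?, ?)",
    [(1024, 1024), (2048, 2000), (512, 512)],
)

# The query from the diff, unchanged.
query = """
SELECT COUNT(*) documents_with_mismatched_file_sizes
FROM arkivstruktur.dokumentobjekt
WHERE filstoerrelse <> _detected_file_size;
"""
mismatched = conn.execute(query).fetchone()[0]
print(mismatched)
```

With this sample data, only the row whose declared size differs from the detected size is counted.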

@petterreinholdtsen
Contributor

Any hope to have this change included in the source?

Instead of looking at the file size, it might be more useful to check the checksum of the content. If the size is different, the checksum will be different, but it will also detect incorrect content.

@ivaylomitrev
Contributor

Hey,

There already is a test for the checksums - AST16: Tests whether the document object checksums specified in arkivstruktur.xml match the ones that the validator calculated using the SHA256 algorithm.

This is obviously a pretty old pull request and I do not remember why we left it hanging like this, but the presence of AST16 may be one of the reasons - after all, a checksum test will always be more accurate than a file size test.
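As a rough illustration of why a checksum comparison subsumes a size comparison, here is a small sketch. It is not the validator's implementation: the helper name and the sample contents are hypothetical, and only Python's standard hashlib is used to compute SHA-256.

```python
import hashlib
from typing import Tuple


def size_and_sha256(content: bytes) -> Tuple[int, str]:
    """Return the byte length and the SHA-256 hex digest of the content."""
    return len(content), hashlib.sha256(content).hexdigest()


# Hypothetical example: one character is altered, so the length is unchanged.
declared_size, declared_sum = size_and_sha256(b"original document")
detected_size, detected_sum = size_and_sha256(b"0riginal document")

size_matches = declared_size == detected_size
checksum_matches = declared_sum == detected_sum

print(size_matches, checksum_matches)
```

Here the size check passes even though the content differs, while the checksum check catches the change.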
