CybOX 3.0: HashType Refactoring

Ivan Kirillov edited this page Oct 28, 2015 · 26 revisions

Issue Description

There are several issues with the current structure for characterizing cryptographic hashes in CybOX, the HashType:

  • The structure is overly verbose and heavyweight for capturing and parsing ubiquitous hash types such as MD5, SHA1, and SHA256, which are arguably by far the most prevalent hashes in cyber threat characterization today. Currently, users must first select the correct value from the default HashNameVocab vocabulary, populate the Type field with that value, set the field's xsi:type to point to the vocabulary, and only then populate the Simple_Hash_Value field with the actual hash value:
  <Type xsi:type="HashNameVocab-1.0">MD5</Type>
  <Simple_Hash_Value>3773a88f65a5e780c8dff9cdc3a056f3</Simple_Hash_Value>
  • The structure has separate fields for capturing simple and fuzzy hash values (Simple_Hash_Value and Fuzzy_Hash_Value, respectively), both fundamentally string values. This seems an unnecessary distinction, as simply specifying the type of a hash (e.g., SSDeep) provides the necessary context for identifying it as simple or fuzzy.

  • Patterning against the structure is semantically confusing, since a pattern must be written against both the Type and *_Hash_Value fields.

Refactoring

For the capture of ubiquitous hash values, we propose the creation of a new HashesType with the following fields. Note that this takes all of the values from the existing HashNameVocab-1.0, thus permitting the deprecation of this vocabulary:

Field Type Description
md5_value string Specifies an MD5 hash value.
md6_value string Specifies an MD6 hash value.
sha1_value string Specifies a SHA1 hash value.
sha256_value string Specifies a SHA256 hash value.
sha224_value string Specifies a SHA224 hash value.
sha384_value string Specifies a SHA384 hash value.
sha512_value string Specifies a SHA512 hash value.
ssdeep_value string Specifies an ssdeep fuzzy hash value.
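
To illustrate the patterning concern, the following is a minimal sketch of matching an MD5 value under each structure. The dict renderings and the helper names `matches_old` and `matches_new` are hypothetical and exist only for illustration; the old-form keys simply mirror the XML field names quoted above.

```python
# Hypothetical dict renderings of the two structures (not a CybOX API).
old_hashes = [
    {"Type": "MD5", "Simple_Hash_Value": "3773a88f65a5e780c8dff9cdc3a056f3"},
]
new_hashes = {"md5_value": "3773a88f65a5e780c8dff9cdc3a056f3"}


def matches_old(hash_list, algorithm, value):
    """Existing structure: a match must test both Type and the value field."""
    return any(
        h.get("Type") == algorithm and h.get("Simple_Hash_Value") == value
        for h in hash_list
    )


def matches_new(hashes, field, value):
    """Proposed structure: a match tests a single named field."""
    return hashes.get(field) == value
```

With the proposed fields, a pattern only needs to name one field (for example `md5_value`) rather than pairing a type condition with a value condition.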

Accordingly, we propose renaming the existing HashType to CustomHashType (and the existing HashListType to CustomHashesType) so that it is used exclusively for custom/non-standard hash values and has the following fields:

Field Type Description
type string Specifies the name of the cryptographic hashing algorithm used to generate the value captured in the hash_value field.
hash_value string Specifies a single cryptographic hash value, of the type defined in the type field.
fuzzy_hash_structure FuzzyHashStructureType Enables the characterization of the key internal components of a fuzzy hash calculation with a given block size.

Example

{
  "file": {
    "hashes": {"md5_value": "3773a88f65a5e780c8dff9cdc3a056f3"},
    "custom_hashes": [
      {"type": "superhash",
       "hash_value": "f49125dac3:352bb35ffrca2:a123dc4599245"}
    ]
  }
}
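
As a rough sketch of how a consumer might read this layout, the snippet below parses the example with Python's standard `json` module and gathers the standard and custom hashes into one list. The function name `collect_hashes` is hypothetical, and the code assumes the JSON shape shown above rather than any finalized CybOX 3.0 serialization.

```python
import json

# The example document from this page, embedded as a string.
DOC = """
{
  "file": {
    "hashes": {"md5_value": "3773a88f65a5e780c8dff9cdc3a056f3"},
    "custom_hashes": [
      {"type": "superhash",
       "hash_value": "f49125dac3:352bb35ffrca2:a123dc4599245"}
    ]
  }
}
"""


def collect_hashes(file_obj):
    """Return (name, value) pairs for standard and custom hashes."""
    found = []
    # Standard hashes: the field name itself identifies the algorithm.
    for field, value in file_obj.get("hashes", {}).items():
        found.append((field, value))
    # Custom hashes: the algorithm name is carried in the "type" field.
    for entry in file_obj.get("custom_hashes", []):
        found.append((entry["type"], entry["hash_value"]))
    return found


pairs = collect_hashes(json.loads(DOC)["file"])
```

Note that the standard hashes need no vocabulary lookup, while the custom entries still carry an explicit type.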

Impact

Each existing CybOX Object that uses the HashListType (full list below) will need to be updated as a result of this change. This will likely entail updating the existing hashes to use the new HashesType and the addition of a new custom_hashes field for the capture of custom hash values.

The only discrepancy with this approach is that the StreamType in the Windows File Object is currently an extension of the HashListType. Accordingly, it will no longer be an extension of the HashListType, and instead will have the new hashes and custom_hashes fields added (as with the other Objects).
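
The update described above could be sketched roughly as follows. This is an illustration only, not a proposed migration tool: the function name `migrate`, the dict rendering of the old entries, and the assumption that existing Type values map to the new field names by lowercasing and appending `_value` are all ours. The fuzzy_hash_structure field is not handled here.

```python
# Assumed mapping from lowercased existing Type values to new field names.
STANDARD_FIELDS = {
    name: name + "_value"
    for name in ("md5", "md6", "sha1", "sha224", "sha256",
                 "sha384", "sha512", "ssdeep")
}


def migrate(old_entries):
    """Split old-style hash entries into 'hashes' and 'custom_hashes'."""
    hashes, custom = {}, []
    for entry in old_entries:
        # The old structure stores the value in one of two fields.
        value = entry.get("Simple_Hash_Value") or entry.get("Fuzzy_Hash_Value")
        field = STANDARD_FIELDS.get(entry["Type"].lower())
        if field:
            hashes[field] = value
        else:
            # Anything outside the standard set becomes a custom hash.
            custom.append({"type": entry["Type"], "hash_value": value})
    return {"hashes": hashes, "custom_hashes": custom}
```

The same split would apply to the StreamType, which would carry the resulting hashes and custom_hashes fields directly instead of extending the HashListType.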

Object List

Object Field
Artifact Hashes
File Hashes
Memory Hashes
PDF File Hashes
PDF File Hashes
PDF File Hashes
Win Executable File Hashes
Win Executable File Hashes
Win Executable File Header_Hashes
Win Executable File Data_Hashes
Win Executable File Hashes
Win Executable File Hashes
Win Executable File Hashes
Win Service Service_DLL_Hashes
Win Task Exec_Program_Hashes