maugern/karton-email-extractor

Email extractor Karton service

Performs extraction of RFC822 e-mail and attachments. Produces "raw" artifacts for further classification.

Minimalistic: it uses only the standard Python "email" package.
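
As an illustration of what extraction with the standard "email" package can look like, here is a minimal sketch. It is not the code of this service: the function name `extract_parts` and the sample message are hypothetical, and only standard-library calls are used.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_parts(raw_eml: bytes):
    """Return a list of (content_type, payload_bytes) for every leaf MIME part."""
    msg = message_from_bytes(raw_eml, policy=policy.default)
    parts = []
    for part in msg.walk():
        # Multipart containers only group other parts; they carry no payload.
        if part.is_multipart():
            continue
        # decode=True undoes the Content-Transfer-Encoding (base64, quoted-printable).
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        parts.append((part.get_content_type(), payload))
    return parts


def build_sample() -> bytes:
    """Build a small RFC822 message with one binary attachment (demo data only)."""
    msg = EmailMessage()
    msg["From"] = "alice@example.com"
    msg["To"] = "bob@example.com"
    msg["Subject"] = "demo"
    msg.set_content("hello")
    msg.add_attachment(
        b"\x00\x01binary",
        maintype="application",
        subtype="octet-stream",
        filename="a.bin",
    )
    return msg.as_bytes()


if __name__ == "__main__":
    for content_type, payload in extract_parts(build_sample()):
        print(content_type, len(payload))
```

Each returned payload is a byte string, which is presumably the kind of content that would be wrapped into a "raw" sample for further classification.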

Consumes:

{
    "type": "sample",
    "stage": "recognized",
    "extension": "eml"
},
{
    "type": "sample",
    "payload": {
        "magic": "SMTP mail,*"
    },
    "mime": [
        "message/rfc822"
    ]
}

Produces:

{
    "type": "sample",
    "kind": "raw"
}

Usage

First, make sure you have set up the core system: https://github.com/CERT-Polska/karton

Do not forget to add your karton.ini to this folder. Then install the dependencies and run the service.

$ python3 -m venv venv && source venv/bin/activate
$ pip install -r requirements.txt
$ python3 karton-email-extractor.py

Why not use sflock instead?

In theory, sflock (used by karton-archive-extractor) can extract eml files, but it has some hardcoded behavior that I don't like:

  • It does not extract the attachments if the filename is empty.
  • "text/plain" and "text/html" parts are not extracted because of a hardcoded whitelist.
  • It decodes from Latin-1.

I also prefer not to extract image files, to limit the volume of data produced.
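
The preferences listed above can be sketched as a small selection policy. This is an assumption about how such rules might be written with the standard "email" package, not the actual implementation: the function name `select_artifacts` and the `part-<index>` fallback name are hypothetical. Payloads are kept as bytes, so no charset has to be guessed.

```python
from email import message_from_bytes, policy


def select_artifacts(raw_eml: bytes):
    """Return (name, content_type, payload_bytes) for the parts worth keeping."""
    msg = message_from_bytes(raw_eml, policy=policy.default)
    artifacts = []
    for index, part in enumerate(msg.walk()):
        # Multipart containers carry no payload of their own.
        if part.is_multipart():
            continue
        # Skip images to limit the volume of data produced.
        if part.get_content_maintype() == "image":
            continue
        payload = part.get_payload(decode=True)
        if not payload:
            continue
        # Keep parts without a filename: fall back to a generated name
        # (hypothetical naming scheme) instead of dropping them.
        name = part.get_filename() or f"part-{index}"
        # text/plain and text/html are kept like any other part.
        artifacts.append((name, part.get_content_type(), payload))
    return artifacts
```

With this policy, a message containing a plain-text body, an HTML alternative, a PNG image and an unnamed binary attachment would yield three artifacts, the image being the only part left out.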
