Gone Phishing - Email Header and Body Analysis

Intro

In this article, I will investigate how you can analyse an email, and confirm if it’s a phishing/scam/spam email. The idea here is to look at some tools that you might use for this, as well as how to open the attached file, that is most likely malicious, in a safe (sandboxed) way so you can learn what it does. Unfortunately, my example doesn’t contain an attachment, but I will look into that anyway.

With that being said, let’s dive into it!

Email Headers

Now that we know how an email travels, let’s look more closely at the components that make up the said email. This is more of a manual approach, but my rationale behind is that we should first understand what we need to do here, before using any specific tooling. You should be able to analyze the suspicious email manually – I don’t see how you would do so with a tool without understanding what’s going on first.

So! We have:

The Header (which contains information about the email – for example, which servers relayed the email)

The Body (which is the email’s text, usually HTML formatted but it can be in regular plaintext)

Let’s look at email header fields:

From – sender’s address
Subject – subject line of the email
To – recipient’s address
Date – when the email was sent

These headers are what you can see in your email client easily.

(I will use a same email example for this article, keep in mind this was a real scam/spam/phishing email I’ve received couple months back)

Points 1, 2, 3, and 4 are what’s explained above. Note that 3. (To header) says the email’s been sent to recipients i.e., my email address is omitted. This can be done in a couple of ways, but the gist of it is that its due to the SMTP protocol and how it works; more precisely put, due to the Internet standards in regards to emails – RFC821, RFC822, and RFC2821. SMTP would deliver to the RCPT TO, while TO, CC, and BCC are what the email claims where the message is sent. They are in fact optional and can thus be altered.

Going back to our case, when viewing the message raw (even better if you extract the .eml file, like in the image below):

When I check the headers through https://mailheader.org

There’s a couple of things to unpack here, and I’ll circle back to it. Before going into any of that here’s another useful link on headers – Understanding an email header. Of course, there’s many more headers to an email, and I’ve added some links to resources at the end of the article.

Email Body

The email body contains the text (either plaintext or HTML formatted). My example email in text only format:

You can view source code as well as the rendered HTML. Header that will be associated with an attachment is usually Content-Type and Content-Disposition. The Content-Type header would say something like pdf/application, and Content-Disposition would say attachment.

I don’t have an attachment, so for me it says Content-Type: text/plain. Content-Transfer-Encoding is also an important one, as it will tell you if it is encoded, and with what encoding.

Before we end this little detour and return to our case, I’d like to emphasize that headers that relate to content can be found in different locations within an email message source. As we’ve seen above, they are not only associated with attachments. They can be text/html (Content-Type), and Content-Transfer-Encoding can be base64, 8bit, etc.

Example Email – analysis

Now that I’ve covered some basics as to what you should be looking at, let’s use some tools to do the same analysis more efficiently. Generally, you don’t need tools (you can already analyze everything with your email/web client) but sender’s IP and reply-to information is only visible through the header.

First tool I wanted to look at is Google’s Messageheader:

Which is great for some quick analysis (more results will be found in the fields below the output above)

However, I slightly prefer the https://mha.azurewebsites.net/ personally:

*Ideally you would use different tools, as you might uncover some additional things that way, its good to mix it up a bit

In the first image (mail header analysis from the Email Headers section) you probably noticed the Reply-to address which you can see above as well.

The email that I received, that’s from an email address sesdep2@kemenpora.go.id actually routes back to wetttttwwttrttr@yandex.com (Reply-To header). Quick Googling has shown that kemenpora.go.id is …The Ministry of Youth and Sports of the Republic of Indonesia (Kemenpora) is a ministry within the Government of Indonesia that is in charge of youth and sports affairs of Indonesia…

Which our first image confirms (Mail server IP and Mail Country from), but the reply would go to an email at Yandex.com.

If I was to continue down this road, the next step would be to investigate the sender’s IP. However, we noticed here that the email is spoofed, so it’s sort of a moot point. The reason for this is because we don’t have any X-Originating-IP or Original-IP headers in the email. However, even if we did, it might not mean all that much as the way threat actors usually go on about this is to utilize a botnet which will make the same email come from different IP addresses.

Concluding our case here, we’ve confirmed the (in)authenticity of the email in question, and acted appropriately (they mentioned in the body to mail them at another address @kakao.com domain, but I didn’t investigate that mailbox beside running it through the MX Toolbox – and a glance was enough to realize something’s phishy)

IP Address, Body, and Sandbox

The Email body is important because it’s usually where the payload will reside (in a form of a link or attachment). You can right click and copy the link location, extract it from the raw header directly, or use something like this – (this is an awesome tool, with much more functionalities, for our case scroll down to data and choose URL extractor).

*You can use https://cyberchef.io/ Extract URLs recipe too

You can also use Email extractor, to extract all the emails found in the raw message headers.

Email extractor example

File Reputation and Sandbox

If you ever want to open/validate those suspicious attachments/files its quite important to know how to do so safely. One such tool can be Talos File Reputation.

On their site, its stated that “The Cisco Talos Intelligence Group maintains a reputation disposition on billions of files. This reputation system is fed into the Cisco Secure Firewall, ClamAV, and Open-Source Snort product lines. The tool below allows you to do casual lookups against the Talos File Reputation system. This system limits you to one lookup at a time, and is limited to only hash matching.”

The other one is the ubiquitous VirusTotal, where you can “Analyze suspicious files and URLs to detect types of malware, automatically share them with the security community.”

I need to mention that you don’t need leet skillz to understand the malware! There are online malware sandboxes that do that for you! Imagine having a malicious .pdf that you want to understand; well, by uploading to one of the platforms you can observe its behavior (URLs it tries to contact, IOCs, and a myriad of other nasty things).

Any.Run – “Analyze a network, file, module, and the registry activity. Interact with the OS directly from a browser. See the feedback from your actions immediately”.

Hybrid Analysis – “This is a free malware analysis service for the community that detects and nalyses unknown threats using a unique Hybrid Analysis technology.”

Joe Security – “Deeply analyze URLs to detect phishing, drive by downloads, tech scam and more. Joe Sandbox uses an advanced AI based algorithm including template matching, perptual hashing, ORB feature detection and more to detect the malicious use of legit brands on websites. Add your own logos and templates to extend the detection capabilities.”

IP Address

As you saw, this one is quite tricky! And even if you were to find a related IP (which is tough) it would kinda block you when you would find it being an ec2 instance. This is a venue that’s tough to pursue, and there’s too many ways to bypass it. Its not impossible, though. I am leaving some further links at the end of the article.

Conclusion

Okay! That’s been quite a ride, and there’s a lot more to unpack there, as we saw ‘simple’ phishing has quite a bit going for it in the background. Some of the stuff mentioned might even be out of scope (a SOC analyst at an enterprise level might just need to check the email authenticity, given that it slipped through the filters, and they are not going to go on a further hunt), but I felt it was necessary to at least try and outline what goes into it. Think of this article as a teaser on the topic.

Also, my case here wasn’t a typical phishing email, it was more of a scam, but I picked it for the spoofed email address. You can really analyze just about any email, and I just hope I’ve managed to describe what was my intent.

Lastyl, stay safe, validate emails, check those headers, and hover! Also, never click immediately or react on anything without thinking on it – this is a basic human bias (we want to resolve stuff, and they deftly prompted us with a challenge, albeit a fake one), and exactly what threat actors abuse. Some would say think before you act.

Additional Resources, Tools, Links

Bonus – How to extract an .eml for viewing

This is ridiculously simple, in fact! Let’s look at an example mail (that I got from Coursera)

1. – Open your email client, and create a new message

2. – Drag the email you’re interested into the body of the new email:

3. – You should see an attachment added to the email:

Notice the name of the attachment

4. – Download the attachment to get the .eml file:

5. – Open the .eml to view raw headers, investigate, find bad guys, confirm stuff, and, most importantly, have some fun!

You now have all the headers!

Cover by Mohamed Hassan

#phish #eml #headers #body #malware #sandbox #pdf

About Version 2 Digital

Version 2 Digital is one of the most dynamic IT companies in Asia. The company distributes a wide range of IT products across various areas including cyber security, cloud, data protection, end points, infrastructures, system monitoring, storage, networking, business productivity and communication products.

Through an extensive network of channels, point of sales, resellers, and partnership companies, Version 2 offers quality products and services which are highly acclaimed in the market. Its customers cover a wide spectrum which include Global 1000 enterprises, regional listed companies, different vertical industries, public utilities, Government, a vast number of successful SMEs, and consumers in various Asian cities.

About VRX
VRX is a consolidated vulnerability management platform that protects assets in real time. Its rich, integrated features efficiently pinpoint and remediate the largest risks to your cyber infrastructure. Resolve the most pressing threats with efficient automation features and precise contextual analysis.

vRx 2022