EnCase Base64 Processing

The Easy Way!

 


 

Background:

MIME (Multipurpose Internet Mail Extensions) is specified in RFC 1341 and RFC 2045.  Base64 is an encoding scheme defined in these RFCs and is used to attach binary files to email messages by converting the binary data to printable ASCII characters.  If you want to read more on how this is done, I suggest going to http://media.it.kth.se/SONAH/ANALYSYS/acts/sonah/guide/mm_mail/mmm31.htm for a good explanation of the process.  A typical file grows by approximately 1/3 of its original size when it is converted to Base64, because every 3 bytes of binary data become 4 printable characters.
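To make the 1/3 growth concrete, here is a minimal sketch using Python's standard base64 module.  The sample data is arbitrary and is only there to have something to measure; it is not taken from any real attachment.

```python
import base64

raw = bytes(range(256)) * 12        # 3,072 bytes of arbitrary sample data
plain = base64.b64encode(raw)       # one continuous Base64 string
wrapped = base64.encodebytes(raw)   # same data with a newline after every 76 characters

# Every 3 input bytes become 4 output characters, so the size ratio is 4/3.
print(len(raw), len(plain), len(wrapped))   # 3072 4096 4150
print(round(len(plain) / len(raw), 3))      # 1.333
```

The line-wrapped form is slightly larger again because of the added line breaks, which is why an attachment inside a mail file is usually a little more than 1/3 bigger than the original.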

A MIME file Base64 attachment in its pure form looks like this:

Content-Type: text/plain; charset=US-ASCII; name=mslinux.jpg
Content-transfer-encoding: Base64

/9j/4AAQSkZJRgABAQEAZABkAAD/2wBDAAkGBwgHBgkICAgKCgkLDhcPDg0N
DhwUFREXIh4jIyEeICAlKjUtJScyKCAgLj8vMjc5PDw8JC1CRkE6RjU7PDn/
2wBDAQoKCg4MDhsPDxs5JiAmOTk5OTk5OTk5OTk5OTk5OTk5OTk5OTk5OTk5

Note: Data between first and last three lines would normally appear here and is usually substantial.  For simplicity it was deleted!

yTVizvrqyYtbzNHnqByD+FQbG/un8qNjf3T+VU0mrMSbWqNV/Eepuu3zlX3V
BmsuSV5XLyOzueSzHJNJtb+6fyo2N/dP5VMYxjshylKW7Eyau2R/dH/eqntb
+6fyq7ZK3lH5T96lJ6BHc//Z

Note the MIME header information and note, especially, the first 17 characters of the data (/9j/4AAQSkZJRgABA).  This is a jpg image; recall that jpg's have a normal header of \xFF\xD8\xFF[\xFE\xE0]\x00 (GREP expression).  When a jpg is converted to Base64, the header is converted as well and is then represented by printable ASCII characters.  Like the rest of the file, the header also grows by 1/3 of its former size.  The header for a jpg encoded with Base64 thus becomes:  /9j/4AAQSkZJRgABA
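The conversion of the header can be checked by hand.  The sketch below assumes a JFIF-style jpg whose first 12 bytes are the usual FF D8 FF E0, a segment length of 16, the identifier "JFIF" and major version 1; the function name base64_prefix is my own and purely illustrative.

```python
import base64

# First 12 bytes of a JFIF-style jpg: SOI (FF D8), APP0 marker (FF E0),
# segment length 0x0010, the identifier "JFIF\0", and major version 1.
JFIF_START = bytes.fromhex("ffd8ffe000104a4649460001")

def base64_prefix(header: bytes) -> str:
    """Return the Base64 text produced by the leading bytes of a file."""
    # Trim to a multiple of 3 bytes so that no '=' padding is emitted;
    # each group of 3 bytes maps to exactly 4 Base64 characters.
    usable = len(header) - (len(header) % 3)
    return base64.b64encode(header[:usable]).decode("ascii")

print(base64_prefix(JFIF_START))  # /9j/4AAQSkZJRgAB
```

Twelve bytes give the first 16 characters of the keyword.  The 17th character, "A", comes from the top six bits of the next byte (the JFIF minor version), which are zero for the common version numbers.  Note that a jpg that does not start with a standard JFIF segment (for example, one beginning FF D8 FF E1) would encode to different characters after the first few, so this keyword is best thought of as a JFIF signature.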

Recall from our EnCase Training that we would search for "Base64" and then go to the beginning of the data, sweep it to the end, bookmark the data, and view it as Base64.  Then we learned the faster way, which was to just sweep the first few bytes of the data, bookmark it, and view it as Base64.  Recently (Spring 2002), postings on the EnCase User Group suggested this or other techniques involving extraction of the data and external viewing.  While this all works, it is slow, involves many steps, and doesn't scale when you have hundreds of attachments to examine.

Automated Method:

Note:  This semi-automatic method only works "manually" in Version 3.  Version 4 can't be used in this manner.  However, there is an EnScript on the GSI EnScript website that programmatically follows the same methodology and works fine with Version 4.  The EnScript is among the Version 3 listings.  Download it, convert it to a V4 EnScript, and it runs fine.  If you are using Version 4, use the script, but understand from the discussion below what is occurring behind the scenes and why it works.

By applying to Base64 encoding the same "file signature" technique we use when looking for printing artifacts (searching for EMF file signatures and viewing hits as pictures), we can rapidly search for jpg's, gif's, zip's, doc's, etc. as Base64 attachments and view the images in the gallery view.

Here's the process for finding jpg's:

1. Find your dbx or mbx files for Outlook Express - Right click on them and choose "View File Structure" - This will open up your email files, making them easy to view when image attachment hits are later discovered.
2. To organize your work and later identify the keyword, go to the "Keyword" view and create a folder called "Base64 jpg header".
3. Under this new folder, create a new keyword and enter the following text in the window:   /9j/4AAQSkZJRgABA
4. Don't check the Case Sensitive or GREP boxes.  Check Unicode if you are dealing with Win2K or WinXP; otherwise you don't need to check it.  It won't hurt if you do, other than slowing down your search.  Leaving it on may result in finding an attachment lurking in the swap file that is in Unicode.  It's a long shot, but you never know.
5. In the "View Bookmark As" window, under Pictures, highlight "Base64 Encoded Picture" - This will cause all search hits to be viewed using the Base64 Viewer and is therefore a critical step in making this process work.
6. Close out your new search term by clicking OK.
7. Conduct a search on your entire case for the new search term.  It will find the attachments wherever they occur, including the ones in the unallocated clusters.
8. Go to your "Bookmark" view - Hit the green "homeplate" box for the folder you just created - Choose the "Gallery" view.  As hits start to occur, hit your refresh button and you should see images for the various Base64 encoded jpg's as they are found.
9. If you find an image and you want to place the image/bookmark in your report, place the cursor on the gray box corresponding to the number of the bookmark, hold down the mouse button, and drag it over the appropriate folder in your report.  It's now in your report!  If the image isn't visible in the report, you may need to edit the bookmark and make sure that "Show Picture" and "Show in Report" are selected.
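To see what the keyword search is doing behind the scenes, here is a rough sketch in Python that looks for the same signature in a block of raw bytes, both as plain ASCII and as UTF-16LE (the form the "Unicode" option is meant to catch).  This is not how EnCase is implemented; the function name find_signature and the sample data are made up for illustration, and unlike the EnCase keyword this sketch matches case exactly.

```python
from typing import Iterator, Tuple

B64_JPG_SIGNATURE = "/9j/4AAQSkZJRgABA"

def find_signature(data: bytes,
                   signature: str = B64_JPG_SIGNATURE) -> Iterator[Tuple[int, str]]:
    """Yield (offset, encoding) for every hit in ASCII or UTF-16LE form."""
    patterns = {
        "ascii": signature.encode("ascii"),
        "utf-16le": signature.encode("utf-16-le"),  # each character followed by 0x00
    }
    for name, pattern in patterns.items():
        start = 0
        while True:
            idx = data.find(pattern, start)
            if idx == -1:
                break
            yield idx, name
            start = idx + 1  # keep scanning after this hit

# Hypothetical sample: one ASCII hit and one Unicode hit surrounded by filler.
sample = (b"junk" + B64_JPG_SIGNATURE.encode("ascii")
          + b"more" + B64_JPG_SIGNATURE.encode("utf-16-le"))
print(sorted(find_signature(sample)))  # [(4, 'ascii'), (25, 'utf-16le')]
```

Run against an extracted dbx file or a chunk of unallocated space read into memory, the offsets returned are the places you would otherwise sweep and bookmark by hand.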

In my experience, and for reasons yet unknown, the EnCase internal viewer doesn't reveal an image for every valid set of Base64 codes.  This holds regardless of whether you do it this way or the slower way.  I have found that you can extract the dbx or individual eml files and still find more images with an external viewer.  Nevertheless, this process can still save you considerable work.  At the very least, it lets you focus on searching for a specific format, usually jpg's, instead of the keyword "Base64".  I'm sure something comparable can be done with UU Encoding, using its headers and EnCase's built-in viewer for this format.  It begs for some more R&D and some EnScripts!
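For the external route, one possible approach (an assumption on my part, not a tool referred to above) is to decode the Base64 parts of an extracted eml file with Python's standard email package and then open the results in any image viewer.  The function name extract_base64_parts and the tiny sample message are hypothetical.

```python
import email
from typing import List, Tuple

def extract_base64_parts(eml_bytes: bytes) -> List[Tuple[str, bytes]]:
    """Return (filename, decoded bytes) for each Base64-encoded part of a message."""
    msg = email.message_from_bytes(eml_bytes)
    results = []
    for part in msg.walk():
        cte = str(part.get("Content-Transfer-Encoding", "")).strip().lower()
        if part.is_multipart() or cte != "base64":
            continue
        data = part.get_payload(decode=True)         # undoes the Base64 encoding
        name = part.get_filename() or "unnamed.bin"  # falls back to the Content-Type name
        results.append((name, data))
    return results

# Hypothetical one-part message carrying only the start of a jpg.
sample_eml = (b"Content-Type: image/jpeg; name=sample.jpg\r\n"
              b"Content-Transfer-Encoding: base64\r\n"
              b"\r\n"
              b"/9j/4AAQSkZJRgAB\r\n")

for name, data in extract_base64_parts(sample_eml):
    print(name, data[:4].hex())  # sample.jpg ffd8ffe0
```

Writing each decoded item to disk under its filename gives you files that an external viewer can open, which is a useful cross-check when the internal viewer shows nothing for a hit.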

You can do the exact same process for gif's.  The Base64 header for a gif is: R0lGOD  (Recall the regular header for a GIF is "GIF8[79]a")

If you are searching for encoded zip files, the Base64 header for zip's is:  UEsDB  (the regular header is "PK", followed by the bytes \x03\x04). Naturally you'll have to extract the data and view it externally, but this is a rapid way of finding all zip file attachments and saves a lot of hunting through "Base64" keyword hits.  Since doc's, xls's, etc. have headers, they will also have a distinct header in Base64 and other encoding schemes.  If you need to determine those headers, take several small doc's and, using UUDeview or mime64, encode them into Base64.  Examine the first part of the data, where the header should be; comparing several encoded files will easily reveal the encoded header that is constant between the files.
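The comparison of several encoded files can also be scripted.  The sketch below uses Python's base64 module in place of UUDeview or mime64 and synthetic sample headers rather than real files, so treat it as an illustration of the idea only; the function name common_base64_header is hypothetical.

```python
import base64
import os
from typing import Iterable

def common_base64_header(samples: Iterable[bytes]) -> str:
    """Encode each sample and return the leading Base64 text they all share."""
    encoded = [base64.b64encode(s).decode("ascii") for s in samples]
    return os.path.commonprefix(encoded)  # character-by-character common prefix

# Synthetic samples: the two GIF versions, and two zip local file headers
# that differ in the bytes following the PK\x03\x04 signature.
gif_samples = [b"GIF87a" + bytes(6), b"GIF89a" + bytes([1, 2, 3, 4, 5, 6])]
zip_samples = [b"PK\x03\x04\x14\x00\x00\x00", b"PK\x03\x04\x0a\x00\x02\x00"]

print(common_base64_header(gif_samples))  # R0lGOD
print(common_base64_header(zip_samples))  # UEsDB
```

With real files you would read the first few dozen bytes of each sample instead.  Use a good number of samples, since two files that happen to share more than the true header will make the derived keyword too long and cause missed hits.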

 

 

 

 

 

 

This web site was created to provide assistance to computer forensics examiners engaging in cyber-crime investigations.  This field is rapidly evolving and changing as technology marches forward.  It is, therefore, intended to be a growing and evolving resource.  As you conduct your examinations and investigations, if you encounter information, links, or have suggestions that would help others, please let me know so I can add it to this site.  My email address is sbunting@udel.edu .  Thank you.

This site created and maintained by: 
Steve Bunting
Email: sbunting@udel.edu