Wednesday, October 15, 2008

emails, images, base64 and html

How many times have you received an image-laden email that fails to render properly and shows you its source instead? This is one I recently received in Outlook:


Return-Path: <sender@gmail.com>
X-Original-To: you@somedomain.com
Delivered-To: you@somedomain.com
Received: from localhost (localhost [127.0.0.1])
.
.
. 
Subject: More 3D Chalk Drawings by Julian Beever!
In-Reply-To: <BAY123-DS3D0865B1F4499DF30C37EA6310@phx.gbl>
MIME-Version: 1.0
Content-Type: multipart/related; 
 boundary="----=_Part_66037_8745117.1224059387776"
References: <BAY123-DS3D0865B1F4499DF30C37EA6310@phx.gbl>
To: undisclosed-recipients:;

------=_Part_66037_8745117.1224059387776
Content-Type: multipart/alternative; 
 boundary="----=_Part_66038_2852077.1224059387777"

------=_Part_66038_2852077.1224059387777
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Excellent, as usual!

------=_Part_66038_2852077.1224059387777
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

<div dir="ltr"><div class="gmail_quote"><br><br><br>
.
.
.
</div><br></div>

------=_Part_66038_2852077.1224059387777--

------=_Part_66037_8745117.1224059387776
Content-Type: image/jpeg; name=image008.jpg
Content-Transfer-Encoding: base64
Content-ID: <image008.jpg@01C92D89.A8F97520>
X-Attachment-Id: 0.8

/9j/4AAQSkZJRgABAQEAYABgAAD/2wBDAAoHBwgHBgoICAgLCgoLDhgQDg0NDh0VFhEYIx8lJCIf
.
.
.
d1qMmLqKKoagXqjRKiijQgpXJWePioorETGoxVvRRRUf/9k=
------=_Part_66037_8745117.1224059387776--

This is a MIME multipart HTML mail that contains a few image/jpeg parts. To get the image(s) out of it, save it as .msg somewhere and open it with a text editor (e.g. Notepad++). Look for the image part you're interested in:

Content-Type: image/jpeg; name=image008.jpg
Content-Transfer-Encoding: base64
Content-ID: <image008.jpg@01C92D89.A8F97520>

Then strip everything else, leaving only the base64-encoded image payload that appears beneath these headers (the one that starts with /9j/ and ends in /9k= in our example). Save it as, say, img-base64.txt. It should now look like:

/9j/4AAQSkZJRgABAQEAYABgAAD/2wBDAAoHBwgHBgoICAgLCgoLDhgQDg0NDh0VFhEYIx8lJCIf
IiEmKzcvJik0KSEiMEExNDk7Pj4+JS5ESUM8SDc9Pjv/2wBDAQoLCw4NDhwQEBw7KCIoOzs7Ozs7
...
JW+6Mj+7T9ixQoEAUFQTiqxPzn61zyep0QWhE4PULioGUk8mrjAbajYcGkiim2egpsSkvz0qcgUi
EjpSY0i1Gp2/KpxRUauxH3jRUG1j/9k=

This is your image, base64 encoded. The "save as .msg" step is necessary because Outlook alters the source it displays, and text copied from there will not decode properly.
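If you would rather not cut the payload out by hand, the same extraction can be sketched in Python with the standard email package. This is only a minimal sketch: it assumes the saved file holds the raw MIME source exactly as shown above, and the function name extract_images is mine, not something from Outlook or any mail tool.

```python
import email
from email import policy


def extract_images(raw_message: bytes) -> dict:
    """Return {name: decoded bytes} for every image/* part of a raw MIME message."""
    msg = email.message_from_bytes(raw_message, policy=policy.default)
    images = {}
    for index, part in enumerate(msg.walk()):
        if part.get_content_maintype() != "image":
            continue
        # get_filename() falls back to the "name" parameter of Content-Type,
        # which is where our example keeps image008.jpg.
        name = part.get_filename() or f"image-{index}"
        # decode=True undoes the Content-Transfer-Encoding (base64 here).
        images[name] = part.get_payload(decode=True)
    return images
```

Each value in the returned dictionary is the decoded image, ready to be written to a file opened in binary mode.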

Now there are several ways to proceed. You can use Notepad++'s built-in base64 decoding (TextFX, TextFX Tools, Base64 Decode) and save the result as .jpg. Or, if Notepad++ is not available, you can use a command-line utility instead, such as the excellent base64 by John Walker.
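A third option, if Python happens to be installed, is a few lines with its standard base64 module. The sketch below assumes the payload was saved as img-base64.txt as described above; the output name image.jpg and both function names are just placeholders I picked.

```python
import base64
from pathlib import Path


def decode_base64_text(text: str) -> bytes:
    """Decode a base64 payload that may be wrapped over several lines."""
    # Remove the line breaks (and any stray spaces) left by the email's wrapping.
    compact = "".join(text.split())
    # validate=True raises an error if non-base64 characters are still present.
    return base64.b64decode(compact, validate=True)


def decode_file(src: str = "img-base64.txt", dst: str = "image.jpg") -> None:
    """Read the base64 text file and write the decoded bytes next to it."""
    Path(dst).write_bytes(decode_base64_text(Path(src).read_text(encoding="ascii")))
```

Calling decode_file() with no arguments reads img-base64.txt from the current directory and writes image.jpg. A decoded JPEG should begin with the bytes FF D8 FF, which is why the encoded payload starts with /9j/.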

Few people are aware, though, that the base64 payload can be used directly in HTML pages, letting the browser do all the hard work! The simplest way is to put the payload in the src attribute of an <img> element:


<img src="data:image/jpeg;base64,/9j/4AAQSkZJRgABAQEA...UG1j/9k="/>

Likewise in a CSS background:

div.image {
  background-image:url(data:image/jpeg;base64,/9j/4AAQSkZJRgABAQEA...UG1j/9k=);
}
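Pasting a long payload by hand is error-prone, so here is a rough sketch of building these data: URIs in Python. Both helper names are hypothetical, and image/jpeg is only a default; the MIME type has to match whatever the part's Content-Type header said.

```python
import base64


def to_data_uri(data: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw bytes and wrap them in a data: URI."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{payload}"


def data_uri_from_payload(base64_text: str, mime_type: str = "image/jpeg") -> str:
    """Wrap an already-encoded payload, dropping the email's line breaks."""
    return f"data:{mime_type};base64,{''.join(base64_text.split())}"


# The three bytes FF D8 FF that open a JPEG encode to "/9j/".
print(f'<img src="{to_data_uri(bytes([0xFF, 0xD8, 0xFF]))}"/>')
# prints: <img src="data:image/jpeg;base64,/9j/"/>
```

The second helper is the one that matches the workflow above: feed it the contents of img-base64.txt and it returns a single-line URI that can go into either the src attribute or the CSS url().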

This technique applies to other kinds of content, not just images. CSS stylesheets and JavaScript can also appear as base64-encoded payloads in HTML pages. I will simply repeat here two examples by Grey Wyvern:

<link rel="stylesheet" type="text/css" href="data:text/css;base64,LyogKioqKiogVGVtcGxhdGUgKioq..." />


<script type="text/javascript" src="data:text/javascript;base64,dmFyIHNjT2JqMSA9IG5ldyBzY3Jv..."></script>

Now, I don't know why anyone would want to do that with JavaScript in particular, since base64 encoding inflates the original size by a factor of 4/3, other than for the perverse pleasure of tinkering.
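The 4/3 figure follows from base64 turning every 3 input bytes into 4 output characters, with the last group padded up to 4. A quick check in Python (the function name encoded_length is mine, and the count ignores the line breaks that email adds on top):

```python
import base64


def encoded_length(n: int) -> int:
    """Length of the base64 encoding of n bytes, padding included."""
    return 4 * ((n + 2) // 3)


raw = bytes(3000)                 # 3000 zero bytes stand in for a small script
encoded = base64.b64encode(raw)

print(len(raw), len(encoded))     # prints: 3000 4000
assert len(encoded) == encoded_length(len(raw))
```

For input that is not a multiple of 3 bytes the overhead is slightly worse, since even a single byte encodes to 4 characters.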