For a recent project, I’ve been working a bit with the new Open XML file format (to the point of actually having to write some Java code to manipulate native Office 2007 documents, if you can believe it). I discovered some interesting file handling behavior in Word 2007 that I’d like to share.
There are actually two new Word document types — the .docx and .docm files. Both of them are Open XML — the difference is that the .docm extension indicates that this file is allowed to have macro content. If you take a Word document with existing macro content and save it as a .docx, Word will literally strip out the macro content as it saves the file, leaving you a nice macro-free document. Pretty nifty.
Likewise, if you take a .docm file and simply rename it to .docx, Word will refuse to open the file (as it sees the verboten macro content). I personally think it would be more useful to just ignore that content, but this is arguably correct behavior.
Where the weirdness comes in is that you can rename that Open XML Word file to have some other extension. Since an Open XML document is basically a collection of parts (XML files) in a package (ZIP archive), I quickly found (thanks to Peter) that Word will happily attempt to open a .zip file (and will succeed if it’s actually an Open XML package). So for fun, I took a .docm with a live macro, renamed it to .zip, and tried to open it in Word. It opened with no problems; the macro content was recognized (and had it been signed by a Trusted Publisher or been in a Trusted Location, I wouldn’t have even gotten the normal security warning).
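Because the package is just a ZIP archive, macro content can be spotted without trusting the file extension at all. Here is a minimal Java sketch of that idea. It assumes the VBA project lives in a part named vbaProject.bin (Word normally stores it as word/vbaProject.bin); the class and method names are made up for illustration, and a more careful check would read the content types declared in [Content_Types].xml instead of matching on the part name.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of any Office API.
final class MacroPartCheck {

    // Returns true if the ZIP package contains a part named vbaProject.bin.
    // Assumption: that part name is how the macro payload is stored.
    // Throws an IOException (ZipException) if the file is not a ZIP archive.
    static boolean hasVbaPart(Path file) throws IOException {
        try (ZipFile zip = new ZipFile(file.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                // Compare case-insensitively; the file's own extension is never consulted.
                if (name.toLowerCase(Locale.ROOT).endsWith("vbaproject.bin")) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

Calling MacroPartCheck.hasVbaPart on the same package gives the same answer whether the file is named .docm, .docx, or .zip, which is the point: the extension says nothing about what is inside.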
So, this is my question: is this a feature or a bug? Me, I tend to think it’s a bug. Perhaps this is my UNIX background showing, but to me, if you’re going to insist on parsing file types by their file extension (which leads to a whole lot of extra programming and drudgery attempting to keep users from doing stupid/malicious things like I just did), then you’d better be strict about it. If the .docm feature is intended to be a strong feature, Word should, IMHO, only honor macro content in a file with a .docm extension.
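The strict rule argued for above is small enough to write down. This is only a sketch of the proposed policy, not a description of what Word actually does; the class and method names are hypothetical, and the caller is assumed to have already determined whether the package contains macro content.

```java
import java.util.Locale;

// Hypothetical policy sketch: macros are honored only in files named *.docm.
final class MacroPolicy {

    // fileName: the name the user opened, extension included.
    // containsVbaPart: whether the package was found to contain macro content.
    static boolean honorMacros(String fileName, boolean containsVbaPart) {
        boolean isDocm = fileName.toLowerCase(Locale.ROOT).endsWith(".docm");
        return containsVbaPart && isDocm;
    }
}
```

Under this rule, the renamed-to-.zip experiment would still open the document, but the macro content would be ignored rather than offered to the user.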
I saw this behavior with Office 2007 Beta 2 Technical Refresh; I don’t know if it’s limited to Word or whether Excel and PowerPoint do it as well.