OpenXmlMemoryStream

Aug 18, 2014 at 1:14 PM
Edited Aug 23, 2014 at 6:45 PM
All,

When I started working with the DocumentBuilder, I stumbled over the OpenXmlMemoryStreamDocument, which didn't seem as useful as it could be. For example, once you've called GetModifiedWmlDocument() (or any of the similar methods), the MemoryStream it contains can't be used anymore, and you can't open any documents from it even though it has not yet been closed or disposed. This means that data must be copied much more often than I would like, at least in my application. You should of course be free to copy data as often as you like, but I don't want to be forced to copy a larger Word document, for example, for each processing step.

Enter OpenXmlMemoryStream and its subclasses WordprocessingMemoryStream, SpreadsheetMemoryStream, and PresentationMemoryStream. Using the WordprocessingMemoryStream as an example, this lets you perform multiple processing steps using the same stream, i.e., without copying data:
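// Requires the Open XML SDK namespaces below, plus the namespace that
// contains WordprocessingMemoryStream in the Open XML Extensions project.
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static void SimpleWordprocessingApplication()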
{
    // Generate a Word document in multiple processing steps based on 
    // a "minimum document" created by WordprocessingMemoryStream.
    using (WordprocessingMemoryStream stream = 
        WordprocessingMemoryStream.Create())
    {
        // Perform a first processing step. Do whatever you like with the
        // WordprocessingDocument. This example just inserts a paragraph.
        // When leaving the scope of the using statement, the stream will
        // contain a perfectly fine WordprocessingDocument which we can
        // continue to process in further steps.
        using (WordprocessingDocument wordDoc = 
            stream.OpenWordprocessingDocument(true))
        {
            Document document = wordDoc.MainDocumentPart.Document;
            InsertParagraph(document.Body, "This is the first paragraph.");
        }

        // Perform a second processing step, using the same stream.
        // Again, do whatever you like with the WordprocessingDocument. We'll
        // just create another paragraph. When leaving the using statement, the
        // WordprocessingDocument will be closed, leaving us with a stream that
        // can be reused over and over again without copying any data.
        using (WordprocessingDocument wordDoc = 
            stream.OpenWordprocessingDocument(true))
        {
            Document document = wordDoc.MainDocumentPart.Document;
            InsertParagraph(document.Body, "This is the second paragraph.");
        }

        // Lastly, let's save the stream contents to a file.
        stream.SaveAs("Generated Document.docx");
    }
}

static void InsertParagraph(Body body, string text)
{
    Paragraph p = new Paragraph(new Run(new Text(text)));

    // Keep the section properties, if any, as the last child of the body.
    if (body.LastChild is SectionProperties)
        body.LastChild.InsertBeforeSelf(p);
    else
        body.Append(p);
}
OpenXmlMemoryStream is derived from MemoryStream, so all of these classes can be used like a MemoryStream. They are designed to be companions of the corresponding OpenXmlPackage subclasses, i.e., WordprocessingDocument, SpreadsheetDocument, and PresentationDocument, and contain methods to:
  • create streams containing "minimum" documents as defined by the standard;
  • create stream instances from byte arrays, files, and other streams (e.g., for the cases where you do want a copy of a stream); and
  • open documents from the stream.
OpenXmlMemoryStream includes Save() and SaveAs(string) methods to save the MemoryStream contents to a file.
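
To illustrate the general idea, here is a minimal sketch of how a MemoryStream subclass could offer factory and save methods of this kind. It is not the actual OpenXmlMemoryStream code: the class name SketchMemoryStream and the FilePath property are hypothetical and exist only for this example, and the sketch uses nothing but System.IO. The one detail worth noting is that the bytes are written into an empty stream rather than passed to the MemoryStream(byte[]) constructor, because that constructor produces a stream that cannot grow, which would get in the way when a document is edited in place.

```csharp
using System;
using System.IO;

// Hypothetical sketch for illustration only; not the real OpenXmlMemoryStream.
public class SketchMemoryStream : MemoryStream
{
    // Path of the file this stream was loaded from or last saved to (assumed name).
    public string FilePath { get; private set; }

    // Creates an expandable stream holding a copy of the given bytes.
    // new MemoryStream(buffer) would create a fixed-size stream instead.
    public static SketchMemoryStream Create(byte[] buffer)
    {
        if (buffer == null) throw new ArgumentNullException(nameof(buffer));

        var stream = new SketchMemoryStream();
        stream.Write(buffer, 0, buffer.Length);
        stream.Position = 0;
        return stream;
    }

    // Creates a stream from the contents of a file and remembers the path.
    public static SketchMemoryStream Create(string path)
    {
        var stream = Create(File.ReadAllBytes(path));
        stream.FilePath = path;
        return stream;
    }

    // Writes the entire stream contents to the given file, regardless of
    // the current stream position.
    public void SaveAs(string path)
    {
        using (var file = File.Create(path))
        {
            WriteTo(file);
        }
        FilePath = path;
    }

    // Saves back to the file the stream was loaded from or last saved to.
    public void Save()
    {
        if (FilePath == null)
            throw new InvalidOperationException("No file path set; call SaveAs first.");
        SaveAs(FilePath);
    }
}
```

With a class along these lines, SketchMemoryStream.Create("Input.docx") followed by any number of processing steps and a final Save() would touch the file system only twice.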

I've also created two classes that provide PowerTools-related extensions (PtMemoryStreamExtensions) and factory methods (PtMemoryStreamFactory). The extensions let you create WmlDocument, SmlDocument, and PmlDocument instances from the respective stream instances. The factory methods allow you to create stream instances from WmlDocument, SmlDocument, and PmlDocument instances (although they are so simple that you could ask why they are required).

I'm proposing to replace the OpenXmlMemoryStreamDocument with this set of classes. I've published them as part of my Open XML Extensions project.

I'm very much interested in your thoughts and feedback.

Regards, Thomas