Images lost after converting docx to HTML

Jun 25, 2009 at 8:55 AM

First, I have a document in Word 2003 format. I use wordconv to convert it to docx.

Then I use OpenXML PowerTools to convert the docx file to HTML. As a result, all images in the docx file are lost.

I checked and realized that the docx file is not using v:/drawing.

How can I use wordconv to produce the right format?

Is there any solution for my problem? Thanks.
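
To see which kind of image markup a converted file actually contains, the main document XML can be inspected directly. The following is only a sketch: the class name `ImageMarkupCounter` is made up for illustration, and it assumes you have already extracted the text of `word/document.xml` from the docx package (a docx file is a ZIP archive). It counts DrawingML containers (`w:drawing`) and VML image references (`v:imagedata`), and makes no claim about which of the two a particular PowerTools version converts.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the Open XML SDK or PowerTools.
static class ImageMarkupCounter
{
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static readonly XNamespace V = "urn:schemas-microsoft-com:vml";

    // Returns (number of w:drawing elements, number of v:imagedata elements).
    public static Tuple<int, int> Count(string documentXml)
    {
        XDocument xml = XDocument.Parse(documentXml);
        int drawings = xml.Descendants(W + "drawing").Count();
        int vmlImages = xml.Descendants(V + "imagedata").Count();
        return Tuple.Create(drawings, vmlImages);
    }
}
```

If the second number is non-zero and the first is zero, the images were stored as VML, which is common for documents that started life in the Word 2003 format.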

May 9, 2013 at 7:27 AM
I am also facing the same problem. Images are not converted when converting a Word document to HTML.
  1. I cannot use a .DOC file for conversion to HTML. Exception: File contains corrupted data.
    So I have to use .DOCX files for conversion to HTML. Is this a limitation?
  2. If the file contains images, it is not converted correctly. When I open the HTML file in a browser, I cannot see the images.
I am using the following code:

public static void ConvertToHTMLUsingOpenXml(string fUpload, string htmlFile, string imagePath)
{
    string sourceDocumentFileName = fUpload;
    string imageDirectoryName = Path.GetFileNameWithoutExtension(sourceDocumentFileName) + "_files";
    int imageCounter = 0;
    byte[] byteArray = File.ReadAllBytes(sourceDocumentFileName);
    using (MemoryStream memoryStream = new MemoryStream())
    {
        memoryStream.Write(byteArray, 0, byteArray.Length);
        WordprocessingDocument doc =
            WordprocessingDocument.Open(memoryStream, true);
        {
            HtmlConverterSettings settings = new HtmlConverterSettings()
            {
                PageTitle = Path.GetFileNameWithoutExtension(sourceDocumentFileName),
                ConvertFormatting = true,
            };
            XElement html = HtmlConverter.ConvertToHtml(doc, settings,
                imageInfo =>
                {
                    DirectoryInfo localDirInfo = new DirectoryInfo(imagePath + imageDirectoryName);//Server.MapPath(HTML_FILE_PATH + imageDirectoryName));
                    if (!localDirInfo.Exists)
                        localDirInfo.Create();
                    ++imageCounter;
                    string extension = imageInfo.ContentType.Split('/')[1].ToLower();
                    ImageFormat imageFormat = null;
                    if (extension == "png")
                    {
                        extension = "jpeg";
                        imageFormat = ImageFormat.Jpeg;
                    }
                    else if (extension == "bmp")
                        imageFormat = ImageFormat.Bmp;
                    else if (extension == "jpeg")
                        imageFormat = ImageFormat.Jpeg;
                    else if (extension == "tiff")
                        imageFormat = ImageFormat.Tiff;
                    if (imageFormat == null)
                        return null;
                    string imageFileName = imagePath + imageDirectoryName + "\\image" + // Server.MapPath(HTML_FILE_PATH + imageDirectoryName) + "/image" +
                        imageCounter.ToString() + "." + extension;
                    try
                    {
                        imageInfo.Bitmap.Save(imageFileName, imageFormat);
                    }
                    catch (System.Runtime.InteropServices.ExternalException)
                    {
                        return null;
                    }
                    XElement img = new XElement(Xhtml.img,
                        new XAttribute(NoNamespace.src, imageFileName),
                        imageInfo.ImgStyleAttribute,
                        imageInfo.AltText != null ?
                            new XAttribute(NoNamespace.alt, imageInfo.AltText) : null);
                    return img;
                });
            File.WriteAllText(htmlFile, html.ToString());
        }
        memoryStream.Close();
    }
}
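
On point 1 above: the Open XML SDK opens only Open XML packages, which are ZIP archives, so a binary .DOC file is rejected no matter what its extension says. A rough way to tell the two apart before calling `WordprocessingDocument.Open` is to look at the first bytes of the file. This is a sketch only; `WordFileSniffer` is a made-up name, and the two signatures are the standard ZIP local-file header and the OLE compound file header.

```csharp
using System;

// Hypothetical helper, not part of the Open XML SDK.
static class WordFileSniffer
{
    // "PK\x03\x04": ZIP local file header, used by .docx packages.
    static readonly byte[] ZipMagic = { 0x50, 0x4B, 0x03, 0x04 };
    // OLE compound file header, used by legacy binary .doc files.
    static readonly byte[] OleMagic = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    // Returns "zip", "ole", or "unknown" based on the leading bytes.
    public static string Sniff(byte[] fileBytes)
    {
        if (StartsWith(fileBytes, ZipMagic)) return "zip";
        if (StartsWith(fileBytes, OleMagic)) return "ole";
        return "unknown";
    }

    static bool StartsWith(byte[] buffer, byte[] magic)
    {
        if (buffer == null || buffer.Length < magic.Length) return false;
        for (int i = 0; i < magic.Length; i++)
            if (buffer[i] != magic[i]) return false;
        return true;
    }
}
```

With the code above, `WordFileSniffer.Sniff(byteArray)` could be called right after `File.ReadAllBytes`. A result of "ole" means the file has to be converted to .docx first (for example with wordconv) before it can be opened.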
Jul 23, 2013 at 7:04 AM
Edited Jul 23, 2013 at 7:22 AM
See changes, works for me. Thanks for posting the problem.
string docPath = @"~\SampleData\3DNAmethod.DOCX"; //doc lives here
string htmlPath = @"~\SampleData\3DNAmethod.html";   // write the html to here
string imgPath = @"~\SampleData\images";           // images will go here
string imgVirtualPath = "/SampleData/images/";     // virtual path to the images found.
ConvertToHTMLUsingOpenXml(Server.MapPath(docPath), Server.MapPath(htmlPath), Server.MapPath(imgPath), imgVirtualPath);

// Namespaces used: System.IO, System.Drawing.Imaging, System.Xml.Linq,
// DocumentFormat.OpenXml.Packaging, OpenXmlPowerTools.
public static void ConvertToHTMLUsingOpenXml(string fUpload, string htmlFile, string imagePath, string imgVirtualPath)
{
    string sourceDocumentFileName = fUpload;
    string imageDirectoryName = Path.GetFileNameWithoutExtension(sourceDocumentFileName) + "_files";
    // Physical folder the images are saved to. Path.Combine supplies the
    // separator that Server.MapPath does not append.
    string imageDirectory = Path.Combine(imagePath, imageDirectoryName);
    int imageCounter = 0;
    byte[] byteArray = System.IO.File.ReadAllBytes(sourceDocumentFileName);
    using (MemoryStream memoryStream = new MemoryStream())
    {
        memoryStream.Write(byteArray, 0, byteArray.Length);
        using (WordprocessingDocument doc = WordprocessingDocument.Open(memoryStream, true))
        {
            HtmlConverterSettings settings = new HtmlConverterSettings()
            {
                PageTitle = Path.GetFileNameWithoutExtension(sourceDocumentFileName),
                ConvertFormatting = false,
            };

            XElement html = HtmlConverter.ConvertToHtml(doc, settings,
                imageInfo =>
                {
                    DirectoryInfo localDirInfo = new DirectoryInfo(imageDirectory);
                    if (!localDirInfo.Exists)
                        localDirInfo.Create();
                    ++imageCounter;
                    string extension = imageInfo.ContentType.Split('/')[1].ToLower();
                    ImageFormat imageFormat = null;
                    if (extension == "png")
                    {
                        // PNG images are re-encoded and saved as JPEG.
                        extension = "jpeg";
                        imageFormat = ImageFormat.Jpeg;
                    }
                    else if (extension == "bmp")
                        imageFormat = ImageFormat.Bmp;
                    else if (extension == "jpeg")
                        imageFormat = ImageFormat.Jpeg;
                    else if (extension == "tiff")
                        imageFormat = ImageFormat.Tiff;
                    // Any other content type is skipped, so that image
                    // will be missing from the HTML.
                    if (imageFormat == null)
                        return null;
                    // Physical path used to save the file.
                    string imageFileName = Path.Combine(imageDirectory,
                        "image" + imageCounter.ToString() + "." + extension);
                    // URL written to the src attribute; it must map to the same file.
                    string imageVirtualName = imgVirtualPath + imageDirectoryName
                        + "/image" + imageCounter.ToString() + "." + extension;
                    try
                    {
                        imageInfo.Bitmap.Save(imageFileName, imageFormat);
                    }
                    catch (System.Runtime.InteropServices.ExternalException)
                    {
                        return null;
                    }

                    XElement img = new XElement(Xhtml.img,
                        new XAttribute(NoNamespace.src, imageVirtualName),
                        imageInfo.ImgStyleAttribute,
                        imageInfo.AltText != null ?
                            new XAttribute(NoNamespace.alt, imageInfo.AltText) : null);
                    return img;
                });
            System.IO.File.WriteAllText(htmlFile, html.ToString());
        }
    }
}
Oct 19, 2013 at 7:53 PM
I have tried the solution mentioned above, but I still cannot see the images in the document. It shows an image placeholder with a red "X" instead of the actual picture. If possible, could you please provide the source code?
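
A red "X" placeholder generally means the browser requested the URL in `src` and did not get an image back. One thing worth checking is that the folder the image is saved to is the same folder the virtual path maps to. The sketch below builds both values from the same pieces so they cannot drift apart. `ImagePaths` and its parameter names are hypothetical, and it assumes `physicalRoot` is the result of `Server.MapPath` for the folder that `virtualRoot` points to.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only.
static class ImagePaths
{
    // Returns (physical file path to save to, URL to put in the src attribute).
    public static Tuple<string, string> Build(
        string physicalRoot, string virtualRoot,
        string folder, int counter, string extension)
    {
        string fileName = "image" + counter + "." + extension;
        // Path.Combine inserts directory separators where they are missing.
        string physical = Path.Combine(physicalRoot, folder, fileName);
        // URLs always use forward slashes; avoid a doubled slash after the root.
        string url = virtualRoot.TrimEnd('/') + "/" + folder + "/" + fileName;
        return Tuple.Create(physical, url);
    }
}
```

Comparing the saved file's location on disk with the `src` value in the generated HTML (view source in the browser) shows quickly whether the two agree.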