Similar questions have been asked, but nothing exactly like mine, so here goes.
We have a collection of Microsoft Word documents on an ASP.NET web server with merge fields whose values are filled in as a result of user form submissions. After the field merge, the server must convert the document to PDF and stream it down to the browser. Our first inclination was to use the Visual Studio Tools for Office API; however, we ran into this warning from Microsoft:
Microsoft does not currently recommend, and does not support, Automation of Microsoft Office applications from any unattended, non-interactive client application or component (including ASP, ASP.NET, DCOM, and NT Services), because Office may exhibit unstable behavior and/or deadlock when Office is run in this environment.
It looks like the field manipulation can be done using the Open XML SDK, but what’s the best way to convert Word 2007 documents to PDF without opening Word? The optimal solution would be low-cost, scalable, have a low memory footprint, be easy to deploy, and have a .NET API.
It’s not exactly open source, but Aspose has a couple of products that can do that:
Aspose.Pdf.Kit is a non-graphical PDF® document manipulation component that enables both .NET and Java developers to manage existing PDF files as well as manage form fields embedded within PDF files. Aspose.Pdf is perfect for creating new PDF files; however, developers often need to edit already existing PDF documents. Aspose.Pdf.Kit allows them to do just that. Aspose.Pdf.Kit allows developers to create powerful applications for merging data directly into PDF documents as well as for updating and managing PDF documents. Aspose.Pdf.Kit is a wonderful product and works great with the rest of our PDF products.
Aspose.Pdf is a non-graphical PDF® document reporting component that enables either .NET or Java applications to create PDF documents from scratch without utilizing Adobe Acrobat®. Aspose.Pdf is very affordably priced and offers a wealth of strong features including: compression, tables, graphs, images, hyperlinks, security and custom fonts. Aspose.Pdf supports the creation of PDF files through API, XML templates and XSL-FO files. Aspose.Pdf is very easy to use and is provided with 14 fully featured demos written in both C# and Visual Basic.
There’s also iTextSharp, which is a C# port of iText, a Java PDF library. I’ve heard that some people have tried it, with mixed results.
The question is “MS Word Documents to PDF in ASP.NET” so I am very puzzled why Aspose.Pdf and Aspose.Pdf.Kit are recommended above. You need to use Aspose.Words because that’s the component that supports Microsoft Word documents to PDF conversion.
Check out Microsoft’s resource on Saving Word 2007 Documents to PDF and XPS Formats using C# or VB.
ActivePdf DocConverter – http://www.activepdf.com/
However, it requires Office to be installed on the server for good-quality conversion.
Aspose.Words may be the best option for you, but it doesn’t convert all visual elements perfectly.
Have a look at the Muhimbi PDF Converter Web Services. It runs on Windows as a service, but can be accessed from any environment that can call web services, Windows or not, including Java and .NET.
Although this solution requires MS Office to be installed on a server (not necessarily the same server as your application), it is very robust and provides perfect conversion fidelity. It goes to great lengths to get around the deadlock problems Microsoft refers to in its KB article.
Disclaimer: I worked on this product. Having said that, it works great.
You should try using OpenOffice for this. It is free and supports a whole range of file conversions. I have used it to convert DOC and DOCX files to HTML with fantastic results.
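For illustration only, here is a rough sketch of what a headless conversion could look like when driven from server-side code. It assumes the LibreOffice flavour of the `soffice` binary is on the PATH and uses its `--headless`, `--convert-to` and `--outdir` command-line options; older OpenOffice builds may need different switches or a separate UNO bridge. The function names are made up for this example, and from ASP.NET you would launch the same command line with System.Diagnostics.Process rather than Python.

```python
import subprocess
from pathlib import Path


def build_convert_command(docx_path, out_dir, soffice="soffice"):
    """Return the argument list for a headless Word-to-PDF conversion.

    Assumes a LibreOffice-style `soffice` executable (hypothetical default).
    """
    return [
        soffice,
        "--headless",            # no GUI, suitable for a server process
        "--convert-to", "pdf",   # target format
        "--outdir", str(out_dir),
        str(docx_path),
    ]


def convert_to_pdf(docx_path, out_dir, soffice="soffice", timeout=120):
    """Run the conversion and return the path where the PDF is expected."""
    cmd = build_convert_command(docx_path, out_dir, soffice)
    # check=True raises if soffice exits with an error; timeout guards
    # against a hung office process blocking the web request forever.
    subprocess.run(cmd, check=True, timeout=timeout)
    return Path(out_dir) / (Path(docx_path).stem + ".pdf")
```

Running the office process out of band like this keeps it outside the web worker, but conversions are still fairly heavy, so you would probably want to queue them rather than spawn one process per request.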
ABCpdf is another popular component that’ll let you convert Word documents to PDF under ASP.NET. However, I believe it too makes use of Microsoft Office or OpenOffice.
Microsoft’s PDF add-in for Word seems to be the best solution for now, but keep in mind that it does not convert all Word documents to PDF correctly, and in some cases you will see a huge difference between the Word document and the output PDF. Unfortunately, I couldn’t find any API that converts all Word documents correctly. The only way I found to ensure the conversion was 100% correct was to convert the documents through a printer driver. The downside is that documents are queued and converted one by one, but you can be sure the resulting PDF looks exactly like the Word document. I personally preferred UDC (Universal Document Converter): I installed it along with Foxit Reader (free version) on the server, then printed the documents by starting a “Process” with its StartInfo.Verb property set to “print”. You can also use FileSystemWatcher to signal when the conversion has completed.
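Because a printer driver converts asynchronously, the calling code has to work out when the output file is actually finished. As a rough sketch of that idea, the snippet below polls for the file and treats it as complete once its size stops changing between checks. This is a polling stand-in for the event-based FileSystemWatcher approach, not the same mechanism, and the function name, timeout and interval values are assumptions chosen for the example.

```python
import os
import time


def wait_for_stable_file(path, timeout=60.0, interval=0.5, settle_checks=2):
    """Return True once `path` exists, is non-empty, and its size has
    stayed the same for `settle_checks` consecutive polls; return False
    if that does not happen before `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    last_size = -1
    stable = 0
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            size = -1  # file not created yet
        if size > 0 and size == last_size:
            stable += 1
            if stable >= settle_checks:
                return True
        else:
            stable = 0  # still growing (or missing): start counting again
        last_size = size
        time.sleep(interval)
    return False
```

A size that has stopped changing is only a heuristic for "the driver has finished writing", so a real deployment might also try to open the file exclusively before streaming it to the browser.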