Convert webpage to image from ASP.NET

I would like to create a function in C# that takes a specific webpage and converts it to a JPG image from within ASP.NET. I don’t want to do this via a third party or thumbnail service as I need the full image. I assume I would need to somehow leverage the WebBrowser control from within ASP.NET, but I just can’t see where to get started. Does anyone have examples?

Answers:

Method 1

Ok, this was rather easy when I combined several different solutions:

These solutions gave me a thread-safe way to use the WebBrowser from ASP.NET:

http://www.beansoftware.com/ASP.NET-Tutorials/Get-Web-Site-Thumbnail-Image.aspx

http://www.eggheadcafe.com/tutorials/aspnet/b7cce396-e2b3-42d7-9571-cdc4eb38f3c1/build-a-selfcaching-asp.aspx

This solution gave me a way to convert BMP to JPG:

Bmp to jpg/png in C#

I simply adapted the code and put the following into a .cs:

using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Threading;
using System.Windows.Forms;

public class WebsiteToImage
{
    private Bitmap m_Bitmap;
    private string m_Url;
    private string m_FileName = string.Empty;

    public WebsiteToImage(string url)
    {
        // Without file 
        m_Url = url;
    }

    public WebsiteToImage(string url, string fileName)
    {
        // With file 
        m_Url = url;
        m_FileName = fileName;
    }

    public Bitmap Generate()
    {
        // Thread 
        var m_thread = new Thread(_Generate);
        m_thread.SetApartmentState(ApartmentState.STA);
        m_thread.Start();
        m_thread.Join();
        return m_Bitmap;
    }

    private void _Generate()
    {
        var browser = new WebBrowser { ScrollBarsEnabled = false };
        // Subscribe before navigating so the completion event cannot be missed
        browser.DocumentCompleted += WebBrowser_DocumentCompleted;
        browser.Navigate(m_Url);

        while (browser.ReadyState != WebBrowserReadyState.Complete)
        {
            Application.DoEvents();
        }

        browser.Dispose();
    }

    private void WebBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // Capture 
        var browser = (WebBrowser)sender;
        browser.ClientSize = new Size(browser.Document.Body.ScrollRectangle.Width, browser.Document.Body.ScrollRectangle.Bottom);
        browser.ScrollBarsEnabled = false;
        m_Bitmap = new Bitmap(browser.Document.Body.ScrollRectangle.Width, browser.Document.Body.ScrollRectangle.Bottom);
        browser.BringToFront();
        browser.DrawToBitmap(m_Bitmap, browser.Bounds);

        // Save as file? 
        if (m_FileName.Length > 0)
        {
            // Save 
            m_Bitmap.SaveJPG100(m_FileName);
        }
    }
}

public static class BitmapExtensions
{
    public static void SaveJPG100(this Bitmap bmp, string filename)
    {
        var encoderParameters = new EncoderParameters(1);
        encoderParameters.Param[0] = new EncoderParameter(System.Drawing.Imaging.Encoder.Quality, 100L);
        bmp.Save(filename, GetEncoder(ImageFormat.Jpeg), encoderParameters);
    }

    public static void SaveJPG100(this Bitmap bmp, Stream stream)
    {
        var encoderParameters = new EncoderParameters(1);
        encoderParameters.Param[0] = new EncoderParameter(System.Drawing.Imaging.Encoder.Quality, 100L);
        bmp.Save(stream, GetEncoder(ImageFormat.Jpeg), encoderParameters);
    }

    public static ImageCodecInfo GetEncoder(ImageFormat format)
    {
        // Look up encoders (not decoders), since the codec is used for saving
        var codecs = ImageCodecInfo.GetImageEncoders();

        foreach (var codec in codecs)
        {
            if (codec.FormatID == format.Guid)
            {
                return codec;
            }
        }

        // Return 
        return null;
    }
}

And can call it as follows:

WebsiteToImage websiteToImage = new WebsiteToImage("http://www.cnn.com", @"C:\Some Folder\Test.jpg");
websiteToImage.Generate();

It works with both a file and a stream. Make sure you add a reference to System.Windows.Forms to your ASP.NET project. I hope this helps.
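
For the stream case, here is a minimal sketch of how the image could be returned straight to the browser from an ASP.NET handler. The handler name `PageImageHandler` is hypothetical, and the sketch assumes the `WebsiteToImage` and `BitmapExtensions` classes above are compiled into the same project. It buffers the JPEG in a `MemoryStream` first rather than writing to the response stream directly.

```csharp
using System.Drawing;
using System.IO;
using System.Web;

// Hypothetical handler (not part of the original answer). Assumes the
// WebsiteToImage and BitmapExtensions classes shown above are available.
public class PageImageHandler : IHttpHandler
{
    public bool IsReusable { get { return false; } }

    public void ProcessRequest(HttpContext context)
    {
        // URL is hard-coded for illustration only.
        var generator = new WebsiteToImage("http://www.cnn.com");

        // Generate() blocks until the STA thread has finished capturing.
        using (Bitmap bitmap = generator.Generate())
        {
            if (bitmap == null)
            {
                // No capture was produced (for example, the page never completed).
                context.Response.StatusCode = 500;
                return;
            }

            using (var buffer = new MemoryStream())
            {
                // Uses the Stream overload of the SaveJPG100 extension method.
                bitmap.SaveJPG100(buffer);

                context.Response.ContentType = "image/jpeg";
                context.Response.BinaryWrite(buffer.ToArray());
            }
        }
    }
}
```

The handler would still need to be registered (for example as an .ashx file or in web.config) before it can be requested.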

UPDATE: I’ve updated the code to include the ability to capture the full page and not require any special settings to capture only a part of it.

Method 2

Good solution by Mr Cat Man Do.

I needed to add one line to suppress script errors that came up on some webpages
(with the help of an awesome colleague of mine):

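private void _Generate()
{
    var browser = new WebBrowser { ScrollBarsEnabled = false };

    // Suppress script error dialogs raised by the page
    browser.ScriptErrorsSuppressed = true;        //           <--

    browser.DocumentCompleted += WebBrowser_DocumentCompleted;
    browser.Navigate(m_Url);

    // Pump messages until the page has loaded, as in Method 1
    while (browser.ReadyState != WebBrowserReadyState.Complete)
    {
        Application.DoEvents();
    }

    browser.Dispose();
}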

Thanks Mr Do

Method 3

Here is my implementation using extension methods and a task factory instead of a thread:

    /// <summary>
    /// Convert url to bitmap byte array
    /// </summary>
    /// <param name="url">Url to browse</param>
    /// <param name="width">width of page (if the page contains frames, you need to pass this parameter)</param>
    /// <param name="height">height of page (if the page contains frames, you need to pass this parameter)</param>
    /// <param name="htmlToManipulate">function to manipulate dom</param>
    /// <param name="timeout">in milliseconds, how long can you wait for page response?</param>
    /// <returns>bitmap byte[]</returns>
    /// <example>
    /// byte[] img = new Uri("http://www.uol.com.br").ToImage();
    /// </example>
    public static byte[] ToImage(this Uri url, int? width = null, int? height = null, Action<HtmlDocument> htmlToManipulate = null, int timeout = -1)
    {
        byte[] toReturn = null;

        Task tsk = Task.Factory.StartNew(() =>
        {
            WebBrowser browser = new WebBrowser() { ScrollBarsEnabled = false };
            browser.Navigate(url);

            browser.DocumentCompleted += (s, e) =>
            {
                var browserSender = (WebBrowser)s;

                if (browserSender.ReadyState == WebBrowserReadyState.Complete)
                {
                    if (htmlToManipulate != null) htmlToManipulate(browserSender.Document);

                    browserSender.ClientSize = new Size(width ?? browser.Document.Body.ScrollRectangle.Width, height ?? browser.Document.Body.ScrollRectangle.Bottom);
                    browserSender.ScrollBarsEnabled = false;
                    browserSender.BringToFront();

                    using (Bitmap bmp = new Bitmap(browserSender.Document.Body.ScrollRectangle.Width, browserSender.Document.Body.ScrollRectangle.Bottom))
                    {
                        browserSender.DrawToBitmap(bmp, browserSender.Bounds);
                        toReturn = (byte[])new ImageConverter().ConvertTo(bmp, typeof(byte[]));
                    }
                }

            };

            while (browser.ReadyState != WebBrowserReadyState.Complete)
            {
                Application.DoEvents();
            }

            browser.Dispose();

        }, CancellationToken.None, TaskCreationOptions.None, TaskScheduler.FromCurrentSynchronizationContext());

        tsk.Wait(timeout);

        return toReturn;
    }

Method 4

There is a good article by Peter Bromberg on this subject here. His solution seems to do what you need…

Method 5

The solution is perfect; it just needs a fix in the line that sets the width of the image. For pages with a large height, it does not set the width appropriately:

    //browser.ClientSize = new Size(browser.Document.Body.ScrollRectangle.Width, browser.Document.Body.ScrollRectangle.Bottom);
    browser.ClientSize = new Size(1000, browser.Document.Body.ScrollRectangle.Bottom);

To add the reference to System.Windows.Forms, use the .NET tab of the Add Reference dialog rather than the COM tab.

Method 6

You could use WatiN to open a new browser, then capture the screen and crop it appropriately.


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.