Surendra Sharma

Sunday, March 24, 2019

Code : Get any attribute value from HTML string in C#


If you have an HTML string in C# and want to extract the value of a particular attribute, you can use the function below, which uses a Regex to get the value.
The program below extracts the “src” attribute value of an HTML “img” tag.

Input string : <img alt="" src="/MediaCenter/PublishImages/DSC_0134.jpg" width="150" style="border:0px solid" />

Output : /MediaCenter/PublishImages/DSC_0134.jpg

Code:

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

namespace Rextester
{
    public class Program
    {
        public static void Main(string[] args)
        {
            string s = "<img alt=\"\" src=\"/MediaCenter/PublishImages/DSC_0134.jpg\" width=\"150\" style=\"border:0px solid\" />";

            var srcs = GetSrcInHTMLImgString(s);
            Console.WriteLine(srcs[0]);//Output: /MediaCenter/PublishImages/DSC_0134.jpg
        }

        public static List<string> GetSrcInHTMLImgString(string htmlString)
        {
            List<string> srcs = new List<string>();
            // The lookbehind (?<=src=") and lookahead (?=") match only the text between the quotes.
            string pattern = @"(?<=src="").*?(?="")";

            Regex rgx = new Regex(pattern, RegexOptions.IgnoreCase);
            MatchCollection matches = rgx.Matches(htmlString);

            for (int i = 0, l = matches.Count; i < l; i++)
            {
                string d = matches[i].Value;
                srcs.Add(d);
            }
            return srcs;
        }
    }
}

We can make this function more generic by passing in any attribute name and getting back its values:

public static List<string> GetAttributeNameInHTMLString(string htmlString, string attributeName)
{
    List<string> attributeValues = new List<string>();
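    // Escape the attribute name so any regex metacharacters in it are treated literally.
    string pattern = string.Format(@"(?<={0}="").*?(?="")", Regex.Escape(attributeName));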

    Regex rgx = new Regex(pattern, RegexOptions.IgnoreCase);
    MatchCollection matches = rgx.Matches(htmlString);

    for (int i = 0, l = matches.Count; i < l; i++)
    {
        string d = matches[i].Value;
        attributeValues.Add(d);
    }
    return attributeValues;
}

Call this function as follows:

string s = "<img alt=\"\" src=\"/MediaCenter/PublishingImages/DSC_0134.jpg\" width=\"150\" style=\"border:0px solid\" />";

var widths = GetAttributeNameInHTMLString(s, "width");
Console.WriteLine(widths[0]);   //Output : 150

var styles = GetAttributeNameInHTMLString(s, "style");
Console.WriteLine(styles[0]);   //Output : border:0px solid
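
One limitation worth noting: the pattern above only recognizes double-quoted values, and a request for “src” would also pick up an attribute such as “data-src”, because the lookbehind only checks how the name ends. Below is a minimal sketch of one way to tighten this, assuming a capture group is acceptable instead of lookarounds. The names AttributeReader and GetAttributeValues are hypothetical and not part of the original code. It is still a regex, not an HTML parser, so text that merely looks like an attribute inside another attribute's value would also match.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AttributeReader
{
    // Hypothetical helper (not from the original post): returns every value of
    // attributeName, accepting either double- or single-quoted values.
    public static List<string> GetAttributeValues(string html, string attributeName)
    {
        var values = new List<string>();

        // (?<![\w-])  : the name must not be the tail of a longer name such as data-src
        // \s*=\s*     : allow optional whitespace around the equals sign
        // (["'])      : opening quote, remembered as group 1
        // (.*?)\1     : shortest run of text up to the matching closing quote
        string pattern = @"(?<![\w-])" + Regex.Escape(attributeName) + @"\s*=\s*([""'])(.*?)\1";

        RegexOptions options = RegexOptions.IgnoreCase | RegexOptions.Singleline;
        foreach (Match match in Regex.Matches(html, pattern, options))
        {
            // Group 2 holds the attribute value without the surrounding quotes.
            values.Add(match.Groups[2].Value);
        }
        return values;
    }
}

public class Program
{
    public static void Main(string[] args)
    {
        string html = "<img data-src='/lazy.jpg' src='/real.jpg' alt=\"it's here\" />";

        // Only the real src attribute is reported; data-src is skipped.
        foreach (string value in AttributeReader.GetAttributeValues(html, "src"))
        {
            Console.WriteLine(value);   // prints /real.jpg
        }
    }
}
```

Because the closing quote must be the same character as the opening one, a value like alt="it's here" comes back whole even though it contains an apostrophe.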

It's a small function, but very handy.

To test it, you can use the online C# compiler “.NET Fiddle” instead of creating a console application in Visual Studio. I often use such online tools for quick testing.

Let me know if you have a better way to get an attribute value from an HTML string.

Thursday, March 14, 2019

Sitecore Hackathon 2019 - Screenshot module


Happy to share that our RAS team participated in Sitecore Hackathon 2019 on Saturday, March 2. We started our day at 6 AM and finished around 10:30 PM.

Out of 6 different problem categories, we selected “Best enhancement to the Sitecore XP UI for Content Editors & Marketers” as our problem statement.

We developed a Screenshot module for the Sitecore Content Editor and Experience Editor. It works like a charm.

Below are reference links for the module video, Sitecore package, and documentation.

I am deliberately skipping all module details here in the hope that you will watch the video 😉. I would love to hear your comments, feedback, or suggestions.

RAS Team Sitecore Hackathon
RAS Team


Results will be announced at SUGCON EU 2019 on April 4th, 2019 in London.

Until then, stay tuned.