
How To Strip Out One Common Attribute From Every Form Element On The Page?

I have a string variable that contains an HTML page's response. It contains hundreds of tags, including the following three HTML tags: <

Solution 1:

Look at Html Agility Pack.

Using regex:

(?<=<[^<>]*)\sprefix\w+="[^"]*"(?=[^<>]*>)

var result = Regex.Replace(s,
    @"(?<=<[^<>]*)\sprefix\w+=""[^""]*""(?=[^<>]*>)", string.Empty);

Solution 2:

Regex is not the right tool here: HTML is not a regular language, so it shouldn't be parsed with regular expressions. I've heard good things about HTML Agility Pack for parsing and working with HTML. Check it out.

Solution 3:

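using System.Linq;

var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(/* your html here */);
foreach (var item in doc.DocumentNode.Descendants()) {
    // ToArray() takes a snapshot so attributes can be removed while iterating.
    foreach (var attr in item.Attributes.Where(x => x.Name.StartsWith("prefix")).ToArray()) {
        item.Attributes.Remove(attr);
    }
}
// The cleaned markup is then available as doc.DocumentNode.OuterHtml.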

Solution 4:

html = Regex.Replace(html, @"(?<=<\w+[^>]*)\s" + Regex.Escape(prefix) + @"\w+\s?=\s?""[^""]*""(?=[^>]*>)", "");

The lookbehind and lookahead make sure the match sits inside a tag, between the opening < and the closing >; in between is a matcher for prefix#####="?????".
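To see what the lookbehind and lookahead buy you, here is a small self-contained sketch. The class name, the StripPrefixed helper and the sample input are made up for illustration, and it assumes attribute values are double-quoted and never contain > or ".

```csharp
using System;
using System.Text.RegularExpressions;

public static class PrefixAttributeDemo
{
    // Hypothetical helper: removes every double-quoted attribute whose
    // name starts with `prefix`, but only when it sits inside a tag.
    public static string StripPrefixed(string html, string prefix)
    {
        string pattern =
            @"(?<=<\w+[^>]*)\s" + Regex.Escape(prefix) + @"\w+\s?=\s?""[^""]*""(?=[^>]*>)";
        return Regex.Replace(html, pattern, "");
    }

    public static void Main()
    {
        // The first occurrence is an attribute; the second is plain text
        // between tags and should be left alone.
        string html = "<input prefix131403013654=\"2\" name=\"a\"><p>text prefix1=\"x\"</p>";
        Console.WriteLine(StripPrefixed(html, "prefix"));
        // prints <input name="a"><p>text prefix1="x"</p>
    }
}
```

The text inside the p element survives because the lookbehind cannot reach back across a > to find an opening <.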

Solution 5:

Here's the heavy-handed method of doing it.

String str = "<tag1 prefix131403013654=\"2\">";
String marker = "prefix131403013654=\"";
int point = str.IndexOf(marker);
while (point != -1) // At least one still exists...
{
    // The value has a leading and an ending double quote, so find the closing one.
    int secondQuote = str.IndexOf("\"", point + marker.Length);
    if (secondQuote == -1) break; // unterminated value, give up
    if (point > 0 && str[point - 1] == ' ') // preceded by a space, so it is an attribute
    {
        // +1 so the closing quote is removed as well
        str = str.Remove(point, secondQuote - point + 1);
        point = str.IndexOf(marker, point);
    }
    else
    {
        point = str.IndexOf(marker, point + 1); // not an attribute, skip past it
    }
}

Edited for better code. Edited again after testing: added +1 to the length so the final quote is included. It works. Basically, you could wrap this in a loop that goes through a list of all the "remove these" values.
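A rough sketch of that outer loop, in case it helps. AttributeStripper and RemoveAttributes are hypothetical names, not part of any library. It assumes every value is double-quoted and the attribute is preceded by a single space, and unlike the snippet above it removes that leading space too.

```csharp
using System;
using System.Collections.Generic;

public static class AttributeStripper
{
    // Hypothetical helper: removes each ` name="value"` occurrence
    // for every attribute name in `names`.
    public static string RemoveAttributes(string html, IEnumerable<string> names)
    {
        foreach (string name in names)
        {
            string marker = " " + name + "=\"";
            int point = html.IndexOf(marker, StringComparison.Ordinal);
            while (point != -1)
            {
                int closingQuote = html.IndexOf('"', point + marker.Length);
                if (closingQuote == -1) break; // unterminated value, stop
                // Remove the leading space, the name, and the quoted value.
                html = html.Remove(point, closingQuote - point + 1);
                point = html.IndexOf(marker, point, StringComparison.Ordinal);
            }
        }
        return html;
    }

    public static void Main()
    {
        string page = "<tag1 prefix131403013654=\"2\"><tag2 other9=\"x\" id=\"k\">";
        var toRemove = new List<string> { "prefix131403013654", "other9" };
        Console.WriteLine(RemoveAttributes(page, toRemove));
        // prints <tag1><tag2 id="k">
    }
}
```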

If you don't know the attribute's full name, only its prefix, you can change it up like so:

String str = "<tag1 prefix131403013654=\"2\">";
int point = str.IndexOf("prefix");
while (point != -1) // At least one still exists...
{
    // Find the opening quote of the value, then the closing quote after it.
    int firstQuote = str.IndexOf("\"", point);
    int secondQuote = firstQuote == -1 ? -1 : str.IndexOf("\"", firstQuote + 1);
    if (secondQuote == -1) break; // no quoted value left
    if (point > 0 && str[point - 1] == ' ') // checking if it's actually an attribute
    {
        // +1 so the closing quote is removed as well
        str = str.Remove(point, secondQuote - point + 1);
        point = str.IndexOf("prefix", point);
    }
    else
    {
        point = str.IndexOf("prefix", point + 1); // skip this occurrence
    }
    // Like I said, a very heavy way of doing it.
}

That will catch all of them that start with prefix.
