Posted on Wednesday, May 28th 2008 at 12:21

Parsing Strings With jQuery


Regular expressions are a powerful tool for parsing and validating strings. Combining them with the simplicity of jQuery selectors makes it easy to build fast, useful string parsers. This post shows a couple of parsers that you can use in various environments, or as a base for your own.

Regular expressions introduction

Virtually every modern programming language supports regular expressions. A regular expression describes a pattern in a compact notation; the regex engine interprets that pattern and uses it to match or parse strings. From Wikipedia:

In computing, regular expressions provide a concise and flexible means for identifying strings of text of interest, such as particular characters, words, or patterns of characters. Regular expressions (abbreviated as regex or regexp, with plural forms regexes, regexps, or regexen) are written in a formal language that can be interpreted by a regular expression processor, a program that either serves as a parser generator or examines text and identifies parts that match the provided specification.

This article is not a tutorial on regular expressions themselves. Instead, I will show you how to integrate them with jQuery to create powerful string parsers.

If you’d like to view the examples right away, here is an HTML sample file.

Extending jQuery with chainable plugins

By offering a simple mechanism for creating plugins, jQuery can easily be extended with custom methods. We are going to take advantage of that in this article by creating chainable jQuery methods that parse the HTML content of the selected elements. Let’s start with a bullet-proof frame for chainable jQuery plugins:



There. The first <script> line simply adds the jQuery core. The second script contains the plugin frame we are going to use. I always wrap jQuery methods and plugins in a private function that uses the $ to reference jQuery to avoid conflicts with other frameworks using the $ sign. Now, let’s add some regular expressions.
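As a rough illustration of the wrapper just described, here is a minimal sketch of such a frame. It is not the original listing: `fakeJQuery` is a hypothetical stand-in that models only `fn` and `each`, so the snippet can run without a browser, and `myPlugin` is a placeholder name. In a real page you would pass the actual `jQuery` object to the wrapper instead.

```javascript
// Hypothetical stand-in for jQuery (assumption, for illustration only).
// It models just the two pieces the frame relies on: `fn` and `each`.
function fakeJQuery(items) {
    var set = Object.create(fakeJQuery.fn);
    set.items = items;
    return set;
}
fakeJQuery.fn = {
    each: function (callback) {
        for (var i = 0; i < this.items.length; i++) {
            callback.call(this.items[i], i, this.items[i]);
        }
        return this;
    }
};

// The frame itself: a private function that receives the library as `$`,
// so `$` inside it cannot clash with another framework's `$`.
(function ($) {
    // `myPlugin` is a placeholder name.
    $.fn.myPlugin = function () {
        this.each(function () {
            // `this` is the current element; the real work goes here.
            this.touched = true;
        });
        return this; // returning the set is what makes the method chainable
    };
})(fakeJQuery);

var set = fakeJQuery([{}, {}]);
var chained = set.myPlugin().myPlugin(); // chaining works
```

Returning the set from the plugin method is the design choice that allows calls such as `$('p').stripHtml().addClass('clean')` in real jQuery code.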

First example: strip HTML tags

As a first example, let’s create a simple method for stripping HTML tags from the selector’s content. This is how I would like to use it later on:

$('p').stripHtml();

Ok, so let’s create the core functionality inside the plugin. First we assign a regex to a variable. This regular expression will match all HTML tags, including the tricky <span class=">">:

var regexp = /<("[^"]*"|'[^']*'|[^'">])*>/gi;

Note the gi after the closing slash. The letters stand for global and case-insensitive, meaning that the regex will match every instance and ignore case. (This particular pattern contains no letters, so the i flag has no practical effect here.) Continuing with the parser, let’s apply this regex to the selector’s content:
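To see what the two flags do, here is a small sketch on a plain string, with no jQuery involved. The sample input and variable names are made up for illustration.

```javascript
var input = '<p>One <B>two</B></p>';

// The tag pattern from above, first without any flags.
var tag = /<("[^"]*"|'[^']*'|[^'">])*>/;

// Without g, replace() stops after the first match.
var withoutG = input.replace(tag, '');

// With g, every match is replaced.
var withG = input.replace(new RegExp(tag.source, 'g'), '');

// The i flag only matters when the pattern contains letters.
var caseSensitive = /<b>/.test(input);
var caseInsensitive = /<b>/i.test(input);

console.log(withoutG);        // One <B>two</B></p>
console.log(withG);           // One two
console.log(caseSensitive);   // false
console.log(caseInsensitive); // true
```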

$(this).html(
    $(this).html().replace(regexp,"")
);

Using our expression and the JavaScript replace method, we now replace the matching HTML tags with nothing (""). Using the jQuery html() method, we also replace the old HTML content of the selector with the parsed string. All that remains is to wrap it up and add our code to the plugin frame:

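(function($) {
    $.fn.stripHtml = function() {
        // Matches any HTML tag, including quoted attribute values that contain ">"
        var regexp = /<("[^"]*"|'[^']*'|[^'">])*>/gi;
        this.each(function() {
            $(this).html(
                $(this).html().replace(regexp, "")
            );
        });
        // `this` is already a jQuery object here; return it to allow chaining
        return this;
    };
})(jQuery);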
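The regex can also be tried on plain strings, which makes it easier to check the tricky case mentioned earlier. This is a minimal sketch; `stripTags` and `stripTagsNaive` are hypothetical helper names, and the naive pattern is included only for contrast.

```javascript
// Hypothetical helper: the same replacement applied to a plain string.
function stripTags(html) {
    var regexp = /<("[^"]*"|'[^']*'|[^'">])*>/g;
    return html.replace(regexp, "");
}

// A simpler pattern, for comparison: it stops at the first ">",
// even when that ">" sits inside a quoted attribute value.
function stripTagsNaive(html) {
    return html.replace(/<[^>]*>/g, "");
}

var simple = stripTags('<p>Hello <em>world</em>!</p>');
var tricky = stripTags('<span class=">">still intact</span>');
var naive = stripTagsNaive('<span class=">">still intact</span>');

console.log(simple); // Hello world!
console.log(tricky); // still intact
console.log(naive);  // ">still intact
```

The quoted-string alternatives in the pattern are what let it skip over a `>` inside an attribute value.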

Second example: clickable URLs

Using the same principle, let’s create another plugin that matches all URLs and wraps each of them in an anchor link:

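$.fn.clickUrl = function() {
    var regexp = /((ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?)/gi;
    this.each(function() {
        $(this).html(
            // $1 is the whole matched URL (the outermost parentheses)
            $(this).html().replace(regexp, '<a href="$1">$1</a>')
        );
    });
    // return the jQuery object itself to allow chaining
    return this;
};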

Note the new regex that matches all URLs inside the string, and the $1 reference in the replace method, which inserts the text captured by the first pair of parentheses into the replacement string. In this example, we simply take the matching URL and wrap it inside an anchor tag. You might have to escape the quotes with backslashes.
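Here is a sketch of the `$1` replacement on a plain string. The `linkify` helper name and the exact anchor markup in the replacement string are assumptions made for illustration; the regex is the one from the plugin.

```javascript
// Hypothetical helper: wraps every URL in the text in an anchor tag.
function linkify(text) {
    var regexp = /((ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?)/gi;
    // $1 refers to the outermost group, which spans the entire URL.
    return text.replace(regexp, '<a href="$1">$1</a>');
}

var linked = linkify('Visit http://example.com/page today');
console.log(linked);
// Visit <a href="http://example.com/page">http://example.com/page</a> today

// Caveat: \S+ consumes every non-whitespace character, so trailing
// punctuation ends up inside the link.
var greedy = linkify('See http://example.com.');
console.log(greedy);
// See <a href="http://example.com.">http://example.com.</a>
```

Because `\S+` is greedy, a URL that is immediately followed by markup such as a closing tag would swallow that markup too, so this pattern works best on whitespace-delimited URLs.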

Third example: escape HTML

Escaping HTML can be useful in some situations. As an example, I can definitely see the benefit of doing something like $('pre').escapeHtml() to render HTML tags correctly inside each preformatted element. This needs no complicated regexp; instead, we can chain several replace methods to match and replace each instance of the HTML 2 entities (a normal pre tag is used here to preserve character rendering):

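$.fn.escapeHtml = function() {
    this.each(function() {
        $(this).html(
            $(this).html()
                // Ampersands go first; otherwise the "&" in the entities
                // produced below would be escaped a second time
                .replace(/&/g, "&amp;")
                .replace(/"/g, "&quot;")
                .replace(/</g, "&lt;")
                .replace(/>/g, "&gt;")
        );
    });
    // return the jQuery object itself to allow chaining
    return this;
};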
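Order matters when chaining these replacements: the ampersand has to be handled before the other characters, because each of the other replacements produces new ampersands. A minimal sketch on plain strings, with hypothetical helper names:

```javascript
// Ampersand first: entities created by later steps are left alone.
function escapeHtml(text) {
    return text
        .replace(/&/g, "&amp;")
        .replace(/"/g, "&quot;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");
}

// Ampersand last: the "&" of every entity just created is escaped again.
function escapeAmpersandLast(text) {
    return text
        .replace(/"/g, "&quot;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/&/g, "&amp;");
}

var good = escapeHtml('<b>T&C</b>');
var bad = escapeAmpersandLast('<b>T&C</b>');

console.log(good); // &lt;b&gt;T&amp;C&lt;/b&gt;
console.log(bad);  // &amp;lt;b&amp;gt;T&amp;C&amp;lt;/b&amp;gt;
```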

Source

I have included all three examples with some simple test strings in an HTML sample file. Just view the HTML source to see how they are implemented and used.


13 Responses so far.

Permanent link At 9:17 am on June 2nd, 2008 , vvvlad wrote:

Really nice article!
I have a really hard time understanding regex :(

Could you please provide an example of how to write regex which will find all html tags, but one (for example - all but “”)
this would be really useful

Thanks

Permanent link At 4:06 pm on August 5th, 2008 , Roman wrote:

I need to figure out how to take some HTML formatted text (separated into multiple paragraphs text. I need to parse the html and put each paragraph into the array. Then I will use the first paragraph in the array as the Meta-Description tag for a page. I am having a really hard time selecting the text between the opening and closing p tags. Any suggestions?

Permanent link At 11:58 pm on September 11th, 2008 , Jess Mann wrote:

vvvlad:
Try this link: http://www.regular-expressions.info/tutorial.html

Roman:
This isn’t very difficult to do… but it won’t really accomplish much. Keep in mind the meta description isn’t useful for very much except for search engines… and search engines won’t parse javascript, meaning they’ll see an empty meta description using this approach.

I would try doing this with a different language, such as php, which would parse things server side and accommodate search engines.

If this is still something you’d like to do, post back here. I’ll try to check back soon.

Best of Luck,
-Jess

Permanent link At 6:42 am on December 11th, 2008 , Michael Edenfield wrote:

Thank you very much for this tutorial. It has been a very helpful learning tool for me in parsing out repetitive intro text from an RSS feed I use on my site. Thanks!
