

Copyright notice:

This article is copyright Melonfire, 2018. All rights reserved.

All source code, brand names, trademarks and other content contained herein is proprietary to Melonfire, 2018. All rights reserved.

Source code within this article is provided with NO WARRANTY WHATSOEVER. It is meant for illustrative purposes only, and is NOT recommended for use in production environments.

Copyright infringement is a violation of law.

Printed from http://www.melonfire.com/community/columns/trog/article.php?id=2



So What's A $#!%% Regular Expression, Anyway?!
Learn how to use regular expressions to quickly perform search and replace operations.

Introduction

Ask any relatively-experienced *NIX user to list his top ten favorite things about the operating system, and you're almost certain to hear him mutter, somewhere between "99% uptime" and "remote system reboots", the phrase "regular expressions".

Ask any relatively-experienced *NIX user to list the ten things he hates most about the operating system, and somewhere between "zombie processes" and "installation", he's sure to spit out the phrase "regular expressions".

It's precisely this complex love-hate equation that spawned the idea for a tutorial on regular expressions - surely, went the reasoning, something that induced such strong emotions in normally hard-headed *NIX administrators was worthy of investigation. And so, regardless of whether you're new to regular expressions, or an old hand at putting them together, the next few pages should help you resolve your conflicted feelings on the subject. Hey - it *is* cheaper than therapy...

And First There Was Love...

Regular expressions, also known as "regex" by the geek community, are a powerful tool used in pattern-matching and substitution. They are commonly associated with almost all *NIX-based tools, including editors like vi, scripting languages like Perl and PHP, and shell programs like awk and sed. You'll even find them in client-side scripting languages like JavaScript - kinda like Madonna, their popularity cuts across languages and territorial boundaries...

A regular expression lets you build patterns using a set of special characters; these patterns can then be compared with text in a file, data entered into an application, or input from a form filled in by users on a Web site. Depending on whether or not there's a match, appropriate action can be taken, and appropriate program code executed.

For example, one of the most common applications of regular expressions is to check whether or not a user's email address, as entered into an online form, is in the correct format; if it is, the form is processed, whereas if it's not, a warning message pops up asking the user to correct the error. Regular expressions thus play an important role in the decision-making routines of Web applications - although, as you'll see, they can also be used to great effect in complex find-and-replace operations.

A regular expression usually looks something like this:


/love/


All this does is match the pattern "love" in the text it's applied to. Like many other things in life, it's simpler to get your mind around the pattern than the concept - but then, that's neither here nor there...
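To see this in action, here is a minimal sketch that assumes Python and its standard "re" module stand in for the tools mentioned above. The slashes in /love/ are just delimiters in Perl- and sed-style notation, so the Python pattern is written without them; the sample strings are made up for illustration.

```python
import re

# /love/ in the article's notation is the pattern string "love" in Python.
pattern = re.compile(r"love")

# Hypothetical sample strings, not taken from the article.
samples = ["all you need is love", "gloves are off", "LOVE", "hate"]

# search() scans the whole string and returns a match object, or None.
results = {text: pattern.search(text) is not None for text in samples}

for text, matched in results.items():
    print(f"{text!r}: {'match' if matched else 'no match'}")
```

Note that "gloves are off" matches, since the pattern only asks for the letters "love" to appear somewhere in the text, while "LOVE" does not, since matching is case-sensitive by default.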

How about something a little more complex? Try this:


/fo+/


This would match the words "fool", "footsie" and "four-seater". And although it's a pretty silly example, you have to admit that there's truth to it - after all, who but fools in love would play footsie in a four-seater?

The "+" that you see above is the first of what are called "meta-characters" - these are characters that have a special meaning when used within a pattern. The "+" meta-character is used to match one or more occurrences of the preceding character - in the example above, the letter "f" followed by one or more occurrences of the letter "o".
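If you'd like to check exactly which part of each word the pattern grabs, the following is a rough sketch, again assuming Python's "re" module. The helper name first_match and the two extra words ("f" and "fridge") are made up for illustration.

```python
import re

def first_match(pattern, text):
    """Return the text matched by the first occurrence of pattern, or None."""
    found = re.search(pattern, text)
    return found.group(0) if found else None

# "fo+" means: an "f" followed by one or more "o" characters.
words = ["fool", "footsie", "four-seater", "f", "fridge"]
matches = {word: first_match(r"fo+", word) for word in words}

for word, matched in matches.items():
    print(f"{word!r} -> {matched!r}")
```

Here "fool" and "footsie" both yield "foo", "four-seater" yields "fo", and the last two words yield None, because an "f" with no "o" after it does not satisfy "one or more".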

Similar to the "+" meta-character, we have "*" and "?" - these are used to match zero or more occurrences of the preceding character, and zero or one occurrence of the preceding character, respectively. So,


/eg*/


would match "easy", "egocentric" and "egg"

while


/Wil?/


would match "Winnie", "Wimpy", "Wilson" and "William", though not "Wendy" or "Wolf".
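The two examples above can be tried out with a short sketch like this one, which assumes Python's "re" module; the helper name first_match is made up for illustration, and the words are the ones used in the text.

```python
import re

def first_match(pattern, text):
    """Return the text matched by the first occurrence of pattern, or None."""
    found = re.search(pattern, text)
    return found.group(0) if found else None

# "*" means zero or more of the preceding character (here, "g").
star = {w: first_match(r"eg*", w) for w in ["easy", "egocentric", "egg"]}

# "?" means zero or one of the preceding character (here, "l").
names = ["Winnie", "Wimpy", "Wilson", "William", "Wendy", "Wolf"]
optional = {w: first_match(r"Wil?", w) for w in names}

print(star)
print(optional)
```

The first pattern picks out "e" from "easy", "eg" from "egocentric" and the whole of "egg". The second picks out "Wi" from "Winnie" and "Wimpy", "Wil" from "Wilson" and "William", and nothing at all from "Wendy" or "Wolf", since neither contains "Wi".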

In case all this seems a little too imprecise, you can also specify a range for the number of matches. For example, the regular expression


/jim{2,6}/


would match "jimmy" and "jimmmmmy!", but not "jim". The numbers in the curly braces represent the lower and upper limits on how many times the preceding character may repeat; you can leave out the upper limit for an open-ended range match, as in /jim{2,}/.
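Here is a small sketch of the range syntax, once more assuming Python's "re" module. The helper name first_match and the extra-long test word (built with eight "m" characters) are made up for illustration.

```python
import re

def first_match(pattern, text):
    """Return the text matched by the first occurrence of pattern, or None."""
    found = re.search(pattern, text)
    return found.group(0) if found else None

# A made-up word with eight "m" characters, to show where each pattern stops.
long_word = "ji" + "m" * 8 + "y"
words = ["jim", "jimmy", "jimmmmmy!", long_word]

# {2,6}: between two and six "m" characters after "ji".
bounded = {w: first_match(r"jim{2,6}", w) for w in words}

# {2,}: two or more "m" characters, with no upper limit.
open_ended = {w: first_match(r"jim{2,}", w) for w in words}

for w in words:
    print(f"{w!r}: bounded={bounded[w]!r}, open-ended={open_ended[w]!r}")
```

With the bounded pattern, "jim" produces no match and "jimmy" produces "jimm". For the eight-"m" word, the bounded pattern stops after six "m" characters, while the open-ended one takes all eight.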





Copyright © 1998-2018 Melonfire. All rights reserved