Regular expressions help you search for and replace a substring within another string by means of pattern matching. For example, if you want to find all the words starting with 'th' in a string, pattern matching is very useful. A few cases where pattern matching is regularly used are:
- verifying the format of email addresses, telephone numbers, zip codes and registration numbers
- removing extra spaces from your content
- replacing a particular word with another, e.g., all 'was' with 'is'
In this tutorial the use of regular expressions is shown from basic to intermediate level.
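Two of the uses listed above, collapsing extra spaces and swapping one word for another, can be sketched with preg_replace(). This is only a minimal illustration; the sample strings and variable names are made up for the example.

```php
<?php
// Sample inputs invented for this sketch.
$messy = "PHP   makes    pattern  matching easy.";
$story = "It was late and the road was empty.";

// Collapse every run of whitespace into a single space.
$clean = preg_replace('/\s+/', ' ', $messy);

// Replace the whole word 'was' with 'is'; \b keeps words like 'wash' untouched.
$present = preg_replace('/\bwas\b/', 'is', $story);

echo $clean, "\n";   // PHP makes pattern matching easy.
echo $present, "\n"; // It is late and the road is empty.
```

The word boundaries matter here: without '\b', the pattern '/was/' would also rewrite the letters inside longer words.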
What is a regular expression?
I will explain regular expressions first. Suppose you have the sentence
I have a car used to carry cariage to canada.
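- '/car/' -- This finds the letters 'car' wherever they occur, whether as the word 'car' itself or inside a longer word such as 'carry'.
- '/car*/' -- The '*' means "zero or more of the preceding character", so this matches 'ca' followed by any number of 'r' characters: 'ca', 'car', 'carr'. To find whole words starting with 'car' (car, carry, cariage), use '/\bcar\w*/', where '\b' marks a word boundary and '\w*' matches the rest of the word.
- '/ca(r|nada|)/' -- This finds the strings 'car' and 'canada'. The symbol '|' means OR. In this example the part 'ca' is fixed, and after 'ca' the two patterns 'r' and 'nada' are ORed. Thus it means: start with 'ca' and then find 'r' or 'nada'. The empty alternative after the last '|' also lets a bare 'ca' match when neither 'r' nor 'nada' follows.
- '/^[A-Z]+$/' -- This matches strings that contain only the uppercase letters A to Z. The '^' sign marks the beginning of the string, '+' means one or more, and '$' marks the end. Thus 'ABC' will be valid but 'aBC' will not.
- '/^[A-Za-z]+$/' -- This matches strings that contain only the letters A to Z and a to z. Thus 'aBc' will pass the test but 'a9B' will not.
- '/^[A-Za-z0-9]+$/' -- This matches strings that contain only the letters A to Z and a to z and the digits 0 to 9.
- '/^[A-Za-z0-9_\-\.]+$/' -- This matches strings that contain only the letters A to Z and a to z, the digits 0 to 9 and the symbols '.', '_' and '-'. The '\' escapes the symbols '-' and '.' so that they are taken literally. Avoid the shorthand range A-z: it also covers the characters [ \ ] ^ _ and ` that sit between 'Z' and 'a' in ASCII. Thus 'Ab9_' will be valid but 'Ab#' will not be valid.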
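The patterns in this list can be tried out in a few lines of PHP. The sketch below assumes the anchored forms with a '+' quantifier and a closing '$' (for example '/^[A-Z]+$/'), and uses '/\bcar\w*/' to pick out whole words that start with 'car'. preg_match() and preg_match_all() are standard PHP functions; the test strings are only illustrations.

```php
<?php
$sentence = 'I have a car used to carry cariage to canada.';

// \b is a word boundary and \w* takes the rest of the word.
preg_match_all('/\bcar\w*/', $sentence, $words);
echo implode(', ', $words[0]), "\n"; // car, carry, cariage

// Anchored character classes: the whole string must fit the class.
$checks = [
    ['/^[A-Z]+$/',            'ABC'],
    ['/^[A-Z]+$/',            'aBC'],
    ['/^[A-Za-z]+$/',         'aBc'],
    ['/^[A-Za-z]+$/',         'a9B'],
    ['/^[A-Za-z0-9_\-\.]+$/', 'Ab9_'],
    ['/^[A-Za-z0-9_\-\.]+$/', 'Ab#'],
];

$results = [];
foreach ($checks as [$pattern, $subject]) {
    $ok = preg_match($pattern, $subject) === 1;
    $results[] = $ok;
    echo $pattern, ' on ', $subject, ': ', $ok ? 'valid' : 'not valid', "\n";
}
```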
Now with this basic understanding of regular expressions we can move on to PHP programming. PHP provides several functions to handle regular expressions, including:
preg_replace()
preg_match()
preg_match_all()
Note that there is no preg_replace_all() function; preg_replace() already replaces every match unless you pass a limit.
So, for example:
/car/ (note the opening and closing / delimiters; these enclose your expression)
indicates that the regular expression is looking for the letters "car". So, if a sentence looked like this:
I have a car used to carry cariage to canada.
A match would be found, and the match would return "car".
Alternatively, if we had put the word carriage into the sentence instead of car, a match would still be made, but it would only return the word "car" since that is what was requested in the regular expression.
Try it yourself:
<?php
preg_match ('/car/', 'I have a car used to carry cariage to canada.', $output);
echo $output[0];
?>
Output : car
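preg_match() stops at the first match. To collect every occurrence, preg_match_all() can be used instead; a short sketch on the same sentence:

```php
<?php
$sentence = 'I have a car used to carry cariage to canada.';

// preg_match_all() returns the number of matches and fills $all[0] with them.
$count = preg_match_all('/car/', $sentence, $all);

echo $count, "\n";                // 3
echo implode(' ', $all[0]), "\n"; // car car car
```

The three hits come from 'car', 'carry' and 'cariage'; 'canada' does not contain the letters 'car' in sequence.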
Now what if we want to find everything that starts with 'car'? Then we can use the pattern '/car.*/'. Here '.' matches any character and '*' means zero or more of them, so '.*' matches the rest of the line.
So the code becomes
<?php
preg_match ( '/car.*/' , 'I own a carnival now.' , $output );
echo $output [ 0 ];
?>
Output : carnival now.
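Because '.*' runs to the end of the line, it returns more than the word itself. If only the word is wanted, one option is the word-character class '\w', sketched here on the same sentence:

```php
<?php
$text = 'I own a carnival now.';

// '.*' is greedy: it takes everything up to the end of the line.
preg_match('/car.*/', $text, $greedy);

// '\w*' stops at the first character that is not a letter, digit or underscore.
preg_match('/car\w*/', $text, $word);

echo $greedy[0], "\n"; // carnival now.
echo $word[0], "\n";   // carnival
```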
So, one more example:
/ca(r|nada|)/
indicates that the regular expression is looking for either the letters "car" or the word "canada". So, if a sentence looked like this:
I own a car now.
A match would be found, and the match would return "car". Alternatively, if the word car had been replaced with "canada", a match would have been made and the word "canada" would have been returned. This is because the regular expression requested a match starting with "ca" followed by either "r" or "nada".
Try it yourself:
<?php
preg_match ( '/ca(r|nada|)/' , 'I own a car now.' , $output );
echo $output [ 0 ];
preg_match ( '/ca(r|nada|)/' , 'I am in canada now.' , $output );
echo $output [ 0 ];
?>
Output : carcanada
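The parentheses also capture whichever alternative matched. In the sketch below (the same idea, written without the empty alternative), $output[0] holds the whole match and $output[1] holds the part matched inside the group:

```php
<?php
preg_match('/ca(r|nada)/', 'I am in canada now.', $output);

echo $output[0], "\n"; // canada
echo $output[1], "\n"; // nada
```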
Getting the domain name out of a URL
<?php
// get host name from URL
preg_match ( "/^(https?:\/\/)?([^\/]+)/i" , "http://www.koderguru.com/index.html" , $matches );
$host = $matches [ 2 ];
// get last two segments of host name
preg_match ( "/[^\.\/]+\.[^\.\/]+$/" , $host , $matches );
echo "domain name is: {$matches[0]}\n" ;
?>
Output : domain name is: koderguru.com
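For real URLs, PHP's built-in parse_url() can take over the first step, which also copes with https:// and port numbers. A minimal sketch follows; the helper name last_two_labels() is hypothetical, and taking the last two labels is a simplification that gives the wrong answer for suffixes such as .co.uk.

```php
<?php
// Hypothetical helper: returns the last two dot-separated labels of a host name.
function last_two_labels(string $host): string
{
    $labels = explode('.', $host);
    return implode('.', array_slice($labels, -2));
}

$url  = 'http://www.koderguru.com/index.html';
$host = parse_url($url, PHP_URL_HOST); // www.koderguru.com

echo 'domain name is: ', last_two_labels($host), "\n"; // domain name is: koderguru.com
```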
Having read the introduction, you should have an understanding of what regular expressions are. The following example illustrates their versatility. Remember, regular expressions are tools that can be used in a variety of ways, not just those illustrated here. They can make otherwise tedious and lengthy jobs a breeze. At the end of the tutorial, the regular expression is combined with PHP in context to show how it would be used.
E-mail Validation
Regular expressions are often used for email validation. Email validation here means checking whether the address submitted by the user follows the format of a valid email address, e.g., johndoe@example.com. Strictly speaking, it does not verify whether the email account exists; it only verifies that the format is correct.
Code Flow
Assign the result of the match to a variable ($validemail): preg_match returns 1 if the pattern matches and 0 if it does not.
Invoke the preg_match function to test the target string against the validation pattern.
<?php
$emailfield = 'johndoe@example.com';
// preg_match() returns 1 on a match, 0 otherwise
$validemail = preg_match('/^[A-Za-z0-9_\-\.]+@[A-Za-z0-9_\-]+(\.[A-Za-z0-9_\-]+)*\.[A-Za-z]{2,4}$/', $emailfield);
?>
Explanation of the pattern
Begin with a delimiter /
Then indicate the beginning of the string with ^
[A-Za-z0-9_\-\.] is any character A-Z, a-z, 0-9, or one of _ - and . (the range is written A-Za-z rather than A-z, because A-z would also admit [ \ ] ^ and `).
Then, indicate that this pattern occurs one or more times with the + symbol.
Then, add an @ after the plus to look for the @ symbol in the e-mail address (it is a must for emails).
Now repeat the previous criteria for the domain name: [A-Za-z0-9_\-]+ (letters, digits, _ or -, one or more times).
The group (\.[A-Za-z0-9_\-]+)* allows any number of further dot-separated parts, as in mail.example.
Then \.[A-Za-z]{2,4} requires a final dot followed by letters (the .com, .net, etc.). The minimum/maximum bracket {2,4} says this last part must be at least 2 and at most 4 characters long (e.g. .de, .au). Newer top-level domains can be longer, so raise the upper bound if you need to accept them.
Finally, the $ indicates the end of the target string
And the expression is closed with our ending delimiter /
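To see how a pattern of this shape behaves, it helps to run it over a handful of addresses. The sketch below assumes a variant with explicit A-Za-z ranges and a repeatable '.label' group; the sample addresses are invented for the example.

```php
<?php
// Assumed variant of the tutorial pattern: local part, '@', one or more
// dot-separated labels, then a final label of 2 to 4 letters.
$pattern = '/^[A-Za-z0-9_\-\.]+@[A-Za-z0-9_\-]+(\.[A-Za-z0-9_\-]+)*\.[A-Za-z]{2,4}$/';

$samples = [
    'johndoe@example.com',
    'jane.doe@mail.example.de',
    'no-at-sign.example.com',
    'johndoe@example',
    'johndoe@example.c',
];

$results = [];
foreach ($samples as $address) {
    $results[$address] = preg_match($pattern, $address) === 1;
    echo $address, ' => ', $results[$address] ? 'valid' : 'invalid', "\n";
}
```

Only the first two samples pass: the third has no @, the fourth has no dot after the domain, and the last ends in a single letter, which is shorter than the {2,4} minimum.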
Scripts
E-mail Validation (email.php)
<?php
if (isset($_POST['submit'])) {
    $emailfield = $_POST['emailfield'] ?? '';
    // Escape user input before echoing it back into the page
    echo htmlspecialchars($emailfield);
    $okay = preg_match('/^[A-Za-z0-9_\-\.]+@[A-Za-z0-9_\-]+(\.[A-Za-z0-9_\-]+)*\.[A-Za-z]{2,4}$/', $emailfield);
    if ($okay) {
        echo "E-mail is validated";
    } else {
        echo "E-mail is incorrect";
    }
}
?>
<form method="POST" action="email.php">
E-mail address: <input type="text" name="emailfield">
<br><input type="submit" name="submit" value="Validate">
</form>
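PHP also ships a ready-made format check, filter_var() with FILTER_VALIDATE_EMAIL, which could stand in for a hand-written pattern in a script like the one above. A sketch, with a hypothetical helper name; like the pattern, it checks only the format, not whether the mailbox exists.

```php
<?php
// Hypothetical helper wrapping PHP's built-in e-mail format filter.
function is_valid_email(string $address): bool
{
    // filter_var() returns the address on success and false on failure.
    return filter_var($address, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(is_valid_email('johndoe@example.com')); // bool(true)
var_dump(is_valid_email('johndoe.example.com')); // bool(false)
```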