Validating Form Fields with Regular Expressions : preg_match()


Regular expressions, commonly known as "regex" or "RegExp", are a powerful pattern-matching notation in which a whole search pattern is written as a single expression. They are used for text processing and manipulation: for example, to verify whether data entered by the user (name, email, phone number, etc.) is in the correct format, or to find or replace a matching string within text content.

preg_match() is a built-in PHP function for regular expression tasks. It performs a pattern match on a string and returns 1 if a match is found, 0 if no match is found, and false if an error occurs. Because 1 is treated as true and 0 as false, the result can be used directly in an if condition. The basic syntax of preg_match() is as follows :
 preg_match('/pattern/', $subject);
Where pattern specifies the pattern to be matched, the '/.../' (forward slashes) mark the beginning and end of the regular expression, and $subject is the text string to be matched against. For example :

Example 1 :
<?php

  $str = "Hello world. This is the Test String.";
  if(preg_match("/Hello/", $str)) {
    echo "The \$str contains the word 'Hello'";
  } else {
    echo "The \$str does not contain the word 'Hello'";
  }

?>
Output :

The $str contains the word 'Hello'

In the above example, the function tries to find the word or pattern "Hello" in the given string, and prints the first message if a match is found.
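
The return values of preg_match() can be illustrated with a minimal sketch. The helper name describe_match() is hypothetical and introduced only for this illustration; the optional third argument of preg_match(), which receives the matched text, is part of the real function.

```php
<?php

// Hypothetical helper: reports what preg_match() returned for a pattern/subject pair.
function describe_match(string $pattern, string $subject): string
{
    // The optional third argument receives the matched text in $matches[0].
    $result = preg_match($pattern, $subject, $matches);

    if ($result === false) {
        return "error";                  // the pattern could not be evaluated
    }
    if ($result === 1) {
        return "matched: " . $matches[0];
    }
    return "no match";                   // $result === 0
}

$str = "Hello world. This is the Test String.";
echo describe_match("/Hello/", $str), "\n";   // matched: Hello
echo describe_match("/hello/", $str), "\n";   // no match (matching is case-sensitive)
echo describe_match("/hello/i", $str), "\n";  // matched: Hello (the i modifier ignores case)
```

Comparing the result with === 1 or === false, rather than relying on truthiness alone, makes it possible to tell "no match" apart from an error.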

Example 2 :
<?php

  $str = "mycsnotes.com";
  if(preg_match("/^[a-zA-Z0-9]+\.[a-zA-Z.]{2,6}$/", $str)) {
    echo "$str is a URL";
  } else {
    echo "$str is not a URL";
  }

?>
Output :

mycsnotes.com is a URL

In the above example, the preg_match() function checks whether the supplied string has the form of a simple URL (a name, a dot, and a domain suffix) by using metacharacters in the pattern. It does not handle a scheme such as http:// or a path.

Metacharacters allow us to perform more complex pattern matches, such as testing the validity of an email address. Some of the most commonly used metacharacters are as follows :

Metacharacter | Description | Example
------------- | ----------- | -------
. | Matches any single character except a newline | /./ matches any string that contains at least one character
^ | Matches the beginning of the string; as the first character inside [...] it excludes the listed characters | /^PH/ matches any string that starts with PH
$ | Matches the end of the string | /com$/ matches guru99.com, yahoo.com etc.
* | Matches the preceding character zero (0) or more times | /com*/ matches computer, communication etc.
+ | Requires the preceding character to appear at least once | /yah+oo/ matches yahoo
\ | Escapes a metacharacter | /yahoo+\.com/ treats the dot as a literal value
[...] | Character class: matches any one of the listed characters | /[abc]/ matches a, b or c
a-z | Inside [...], matches any lower case letter | /[a-z]/ matches cool, happy etc.
A-Z | Inside [...], matches any upper case letter | /[A-Z]/ matches WHAT, HOW, WHY etc.
0-9 | Inside [...], matches any digit between 0 and 9 | /[0-4]/ matches 0, 1, 2, 3, 4
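
The behaviour of these metacharacters can be checked with a short sketch. The helper is_match() and the sample strings are assumptions made for illustration; each row pairs a pattern with a subject and the result we expect.

```php
<?php

// Hypothetical helper: true only when preg_match() reports a match.
function is_match(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

// Each entry: pattern, subject, expected result.
$checks = [
    ['/^PH/',        'PHP',       true],   // ^ anchors the match to the start
    ['/^PH/',        'ALPHA',     false],  // PH is present, but not at the start
    ['/com$/',       'yahoo.com', true],   // $ anchors the match to the end
    ['/yah+oo/',     'yahhhoo',   true],   // + allows one or more h
    ['/yah+oo/',     'yaoo',      false],  // + needs at least one h
    ['/yahoo\.com/', 'yahooXcom', false],  // \. matches only a literal dot
    ['/^[0-4]$/',    '3',         true],   // a range works inside a character class
    ['/^[0-4]$/',    '7',         false],  // 7 is outside the range 0-4
];

foreach ($checks as [$pattern, $subject, $expected]) {
    $label = is_match($pattern, $subject) ? 'match' : 'no match';
    printf("%-14s %-10s %s\n", $pattern, $subject, $label);
}
```

The patterns are written in single quotes so that PHP passes the backslash in \. to the regular expression engine unchanged.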

For instance, the preg_match() call in Example 2 is :
 preg_match("/^[a-zA-Z0-9]+\.[a-zA-Z.]{2,6}$/", $str)
Where
  • ^ represents the beginning of the string.
  • [a-zA-Z0-9] represents the allowed characters: letters and digits.
  • + requires the preceding character class to appear one or more times.
  • \. denotes the literal . in a URL, as in .com, .edu etc.; the \ (backslash) escapes the . (dot) so that it is not interpreted as a metacharacter.
  • [a-zA-Z.] covers the domain part, like com, org, edu etc.; the dot is allowed so that suffixes such as co.uk also match.
  • {2,6} denotes that the length of the domain part should be between 2 and 6 characters.
  • $ represents the end of the string.
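
To see what this pattern accepts and rejects, here is a small sketch that runs the pattern from Example 2 over a few sample strings. The helper name looks_like_domain() and the sample values are assumptions for illustration only.

```php
<?php

// Hypothetical helper wrapping the pattern from Example 2.
function looks_like_domain(string $str): bool
{
    return preg_match('/^[a-zA-Z0-9]+\.[a-zA-Z.]{2,6}$/', $str) === 1;
}

$samples = [
    'mycsnotes.com',        // accepted: name, dot, 3-letter suffix
    'example.co.uk',        // accepted: "co.uk" is 5 characters of [a-zA-Z.]
    'mycsnotes',            // rejected: no dot
    'mycsnotes.c',          // rejected: suffix shorter than 2 characters
    'my-site.com',          // rejected: hyphen is not in [a-zA-Z0-9]
    'http://mycsnotes.com', // rejected: ":" and "/" are not allowed
];

foreach ($samples as $sample) {
    echo $sample, looks_like_domain($sample) ? " is accepted\n" : " is rejected\n";
}
```

The rejected cases show that the pattern is deliberately simple: it would need to be extended before it could validate hyphenated names or full URLs.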

The example below shows a regular expression for email validation :
<?php

  $email = "admin_101@mycsnotes.com";
  if (preg_match("/^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$/", $email)) {
    echo "$email is a valid email address";
  } else {
    echo "$email is NOT a valid email address";
  }

?>
Output :

admin_101@mycsnotes.com is a valid email address

In the above example, the preg_match() call is :
 preg_match("/^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$/", $email)
Where :
  • ^ represents the beginning of the string.
  • [a-zA-Z0-9._-] represents the allowed characters before the @: letters, digits, . (dot, which is a literal character inside a character class), _ (underscore) and - (hyphen or dash).
  • + requires the preceding character class to appear one or more times.
  • @ matches the literal @ that separates the user name from the domain.
  • [a-zA-Z0-9-]+ matches the domain name: one or more letters, digits or hyphens.
  • \. denotes the literal . before the domain suffix, as in .com, .edu etc.; the \ (backslash) escapes the . (dot) so that it is not interpreted as a metacharacter.
  • [a-zA-Z.] covers the domain suffix, like com, org, edu etc.
  • {2,5} denotes that the length of the domain suffix should be between 2 and 5 characters.
  • $ represents the end of the string.
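
Putting this together for a form, the following is a minimal sketch of how submitted fields might be checked. The function validate_form(), the field names name and email, and the name rule (2 to 30 letters or spaces) are all assumptions made for illustration; only the email pattern comes from the example above.

```php
<?php

// Hypothetical helper: validates an associative array of submitted form
// fields (for example $_POST) and returns a list of error messages.
function validate_form(array $input): array
{
    $errors = [];

    // Same email pattern as in the example above.
    $emailPattern = '/^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$/';
    // Assumed rule for illustration: a name is 2 to 30 letters or spaces.
    $namePattern  = '/^[a-zA-Z ]{2,30}$/';

    // Missing fields are treated as empty strings.
    $name  = trim($input['name'] ?? '');
    $email = trim($input['email'] ?? '');

    if (preg_match($namePattern, $name) !== 1) {
        $errors[] = "Name must contain only letters and spaces (2 to 30 characters)";
    }
    if (preg_match($emailPattern, $email) !== 1) {
        $errors[] = "Email address is not in a valid format";
    }

    return $errors;
}

// An empty array means every field passed.
print_r(validate_form(['name' => 'Admin User', 'email' => 'admin_101@mycsnotes.com']));
print_r(validate_form(['name' => 'X', 'email' => 'admin@@mycsnotes']));
```

In a real form handler the array passed in would typically be $_POST. The email pattern here is intentionally simple and rejects some valid addresses; PHP's built-in filter_var() with FILTER_VALIDATE_EMAIL is an alternative worth considering.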