Regular Expressions in Java, Part 1

In the previous post, we looked at conditional statements. In this post, we will learn about regular expressions in Java.

What is a regular expression?

A regular expression is a string pattern used to search, edit, or match text or data. When the pattern is applied to a piece of text, the result may be a single matched substring or a set of matches. A match attempt returns true if the pattern matches and false otherwise.

Java regex is the official Java regular expression API. It lives in the java.util.regex package and has been part of Java since version 1.4.
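To make the true/false result concrete, here is a minimal sketch using Pattern.matches from java.util.regex. The class name RegexQuickCheck and the helper isAllDigits are made up for this illustration; note that the regex \d+ has to be written as "\\d+" inside a Java string literal.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only for illustration.
class RegexQuickCheck {

    // Returns true only if the whole input consists of one or more digits.
    static boolean isAllDigits(String input) {
        return Pattern.matches("\\d+", input);
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("12345")); // true
        System.out.println(isAllDigits("12a45")); // false
    }
}
```

Pattern.matches requires the entire input to match the pattern, so a stray letter anywhere makes the result false.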

Core classes of Java Regex:

There are 3 core classes in Java regex, listed below.

  1. The Pattern Class (java.util.regex.Pattern)
  2. The Matcher Class (java.util.regex.Matcher)
  3. The PatternSyntaxException Class (java.util.regex.PatternSyntaxException)

The Pattern class is used to create a string pattern, i.e. a regular expression. A Pattern object is the compiled representation of a regular expression.

A Matcher object is created by invoking the matcher method of a Pattern object. The Matcher locates occurrences of the pattern in the text.

PatternSyntaxException is thrown when the pattern has a syntax error.
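The way the three classes fit together can be sketched as follows. The class CoreClassesDemo and its methods findAll and isValidRegex are hypothetical names chosen for this example; the calls Pattern.compile, Pattern.matcher, Matcher.find, and Matcher.group are part of the standard API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper class, used only for illustration.
class CoreClassesDemo {

    // Compiles the regex once, then collects every occurrence found in the text.
    static List<String> findAll(String regex, String text) {
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(text);
        List<String> matches = new ArrayList<>();
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    // Returns true if the regex compiles, false if its syntax is invalid.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(findAll("cat", "cat, concat, dog")); // [cat, cat]
        System.out.println(isValidRegex("[a-z"));               // false (unclosed character class)
    }
}
```

PatternSyntaxException is an unchecked exception, so catching it is optional; it is usually worth doing when the pattern comes from user input.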

Another important aspect of regular expressions is their syntax, which we need to know before we can write patterns. The syntax is extensive, and we are not going to cover it in full detail. Instead, we will learn the commonly used basics. For more details, refer to the Javadoc page of the Pattern class.

Syntax of regular expressions:

1. Characters:

Character   Description
\\          The backslash character
\t          The tab character ('\u0009')
\n          The newline (line feed) character ('\u000A')
\r          The carriage-return character ('\u000D')
\f          The form-feed character ('\u000C')
\e          The escape character ('\u001B')
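One practical point when using these escapes from Java source code: the backslash is also the escape character of Java string literals, so each backslash in a regex is written twice. The sketch below illustrates this; EscapeDemo and its method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only for illustration.
class EscapeDemo {

    // True if the text contains at least one tab character.
    // The regex \t is written as "\\t" in Java source.
    static boolean containsTab(String text) {
        return Pattern.compile("\\t").matcher(text).find();
    }

    // True if the text contains a literal backslash.
    // The regex \\ is written as "\\\\" in Java source.
    static boolean containsBackslash(String text) {
        return Pattern.compile("\\\\").matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsTab("a\tb"));            // true
        System.out.println(containsBackslash("C:\\temp"));  // true
        System.out.println(containsBackslash("C:/temp"));   // false
    }
}
```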

2. Character classes:

Character class   Description
[abc]             Simple class: matches a, b, or c.
[^abc]            Negation: matches any character except a, b, or c.
[a-zA-Z]          Range: matches any character from a to z or from A to Z.
[a-d[m-p]]        Union: matches any character from a to d or from m to p.
[a-z&&[def]]      Intersection (of a to z and def): matches d, e, or f.
[a-z&&[^bc]]      Subtraction: matches a to z except b and c.
[a-z&&[^m-p]]     Subtraction: matches a to z except m to p.
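The character classes above can be tried out one character at a time. This is a small sketch; CharacterClassDemo and matchesChar are hypothetical names introduced for the example.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only for illustration.
class CharacterClassDemo {

    // Returns true if the single character c belongs to the given character class.
    static boolean matchesChar(String charClass, char c) {
        return Pattern.matches(charClass, String.valueOf(c));
    }

    public static void main(String[] args) {
        System.out.println(matchesChar("[abc]", 'b'));         // true  (simple class)
        System.out.println(matchesChar("[^abc]", 'b'));        // false (negation)
        System.out.println(matchesChar("[a-d[m-p]]", 'n'));    // true  (union)
        System.out.println(matchesChar("[a-z&&[def]]", 'a'));  // false (intersection)
        System.out.println(matchesChar("[a-z&&[^bc]]", 'b'));  // false (subtraction)
        System.out.println(matchesChar("[a-z&&[^m-p]]", 'q')); // true  (subtraction)
    }
}
```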

3. Predefined Character Classes:

Predefined class   Description
.                  Matches any single character. Line terminators are matched only when the DOTALL flag is set.
\d                 Matches any digit: [0-9]
\D                 Matches any non-digit: [^0-9]
\s                 Matches any whitespace character: space, tab, line feed, carriage return, form feed, vertical tab
\S                 Matches any non-whitespace character: [^\s]
\w                 Matches any word character: [a-zA-Z_0-9]
\W                 Matches any non-word character: [^\w]
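The predefined classes are handy with the regex-aware methods of String (replaceAll and split), which use java.util.regex internally. The following is only a sketch; PredefinedClassDemo and its method names are made up for the example.

```java
import java.util.Arrays;

// Hypothetical helper class, used only for illustration.
class PredefinedClassDemo {

    // Replaces every digit (\d) with '#'.
    static String maskDigits(String text) {
        return text.replaceAll("\\d", "#");
    }

    // Splits the text on runs of whitespace (\s+), ignoring leading and trailing blanks.
    static String[] splitOnWhitespace(String text) {
        return text.trim().split("\\s+");
    }

    // Removes every non-word character (\W), keeping letters, digits, and underscores.
    static String keepWordChars(String text) {
        return text.replaceAll("\\W", "");
    }

    public static void main(String[] args) {
        System.out.println(maskDigits("Order 66"));                          // Order ##
        System.out.println(Arrays.toString(splitOnWhitespace("a  b\tc")));   // [a, b, c]
        System.out.println(keepWordChars("snake_case-1!"));                  // snake_case1
    }
}
```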

4. Boundary Matchers:

Boundary matcher   Description
^                  Matches the beginning of a line.
$                  Matches the end of a line.
\b                 Matches a word boundary.
\B                 Matches a non-word boundary.
\A                 Matches the beginning of the input text.
\G                 Matches the end of the previous match.
\Z                 Matches the end of the input text, except for the final line terminator, if any.
\z                 Matches the end of the input text.

By default, ^ and $ apply to the whole input; with the MULTILINE flag they also match at the start and end of each line.
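Two of the boundary matchers in action: \b for whole-word search and ^ combined with the Pattern.MULTILINE flag for per-line matching. This is a sketch under the assumption that the search terms should be treated literally, which is why Pattern.quote is used; BoundaryDemo and its method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only for illustration.
class BoundaryDemo {

    // Counts occurrences of word that are delimited by word boundaries (\b).
    static int countWholeWord(String word, String text) {
        Matcher m = Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // Counts lines that start with prefix. MULTILINE makes ^ match at the
    // start of every line instead of only at the start of the input.
    static int countLinesStartingWith(String prefix, String text) {
        Matcher m = Pattern.compile("^" + Pattern.quote(prefix), Pattern.MULTILINE).matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWholeWord("cat", "cat concat cat."));      // 2
        System.out.println(countLinesStartingWith("#", "# a\nb\n# c"));    // 2
    }
}
```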

5. Quantifiers:

Greedy   Reluctant   Possessive   Description
X?       X??         X?+          Matches X once, or not at all.
X*       X*?         X*+          Matches X zero or more times.
X+       X+?         X++          Matches X one or more times.
X{n}     X{n}?       X{n}+        Matches X exactly n times.
X{n,}    X{n,}?      X{n,}+       Matches X at least n times.
X{n,m}   X{n,m}?     X{n,m}+      Matches X at least n but not more than m times.
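The three quantifier flavors describe the same repetition counts but differ in how much text they consume. A greedy quantifier takes as much as possible and backtracks if needed, a reluctant one takes as little as possible, and a possessive one takes as much as possible and never backtracks. The sketch below shows the difference on one input; QuantifierDemo and firstMatch are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only for illustration.
class QuantifierDemo {

    // Returns the first match of regex in text, or null if there is none.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "xfooxxxxxxfoo";
        System.out.println(firstMatch(".*foo", text));   // xfooxxxxxxfoo (greedy)
        System.out.println(firstMatch(".*?foo", text));  // xfoo (reluctant)
        System.out.println(firstMatch(".*+foo", text));  // null (possessive: .*+ consumes everything)
        System.out.println(firstMatch("a{2,3}", "aaaa")); // aaa
    }
}
```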

6. Logical Operators:

Logical operator   Description
XY                 X followed by Y
X|Y                Either X or Y
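Both operators can be seen in a short sketch. LogicalOperatorDemo and its method names are hypothetical; the patterns cat|dog and \d[a-z] were picked only to illustrate alternation and concatenation.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only for illustration.
class LogicalOperatorDemo {

    // X|Y: the whole input must be either "cat" or "dog".
    static boolean isCatOrDog(String input) {
        return Pattern.matches("cat|dog", input);
    }

    // XY: a digit followed by a lowercase letter.
    static boolean isDigitThenLetter(String input) {
        return Pattern.matches("\\d[a-z]", input);
    }

    public static void main(String[] args) {
        System.out.println(isCatOrDog("dog"));        // true
        System.out.println(isCatOrDog("cow"));        // false
        System.out.println(isDigitThenLetter("7x"));  // true
        System.out.println(isDigitThenLetter("x7"));  // false
    }
}
```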

We will cover the rest of regular expressions in part 2.


Shekhar Sharma

Shekhar Sharma is the founder of testingpool.com. This website is his window to the world. He believes that "knowledge increases by sharing, not by saving".
