regular expression


Remove all non-digit characters from a String

The following Java snippet removes all non-digit characters from a String.
A non-digit character is any character that is not in the set [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].


// Strings are immutable, so replaceAll returns a new String; keep the result.
String digitsOnly = myString.replaceAll("\\D", "");

For a summary of regular-expression constructs and the character classes that Java patterns support, see the Javadoc for java.util.regex.Pattern.

The \D pattern used in the code (written as "\\D" inside a Java string literal, because the backslash itself must be escaped) is a predefined character class for non-digit characters. It is equivalent to [^0-9], which negates the character class for digit characters [0-9].
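The equivalence between \D and [^0-9] can be checked with a small sketch. The class name DigitsOnly, the helper keepDigits, and the sample phone number are made up for illustration; only String.replaceAll comes from the snippet above.

```java
class DigitsOnly {
    // Returns a copy of the input with every non-digit character removed.
    // The input string itself is left unchanged.
    static String keepDigits(String input) {
        return input.replaceAll("\\D", "");
    }

    public static void main(String[] args) {
        String phone = "+1 (555) 010-9999"; // hypothetical sample value

        // Predefined class \D
        System.out.println(keepDigits(phone));              // 15550109999

        // Explicit negated class, same result
        System.out.println(phone.replaceAll("[^0-9]", "")); // 15550109999
    }
}
```

If the same replacement runs many times, compiling the pattern once with Pattern.compile and reusing it avoids recompiling the expression on every call.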


How to find lines that contain only lowercase characters

To print all lines that contain only lowercase characters, use the following regular expression with grep:


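egrep '^[[:lower:]]+$' <file>
# If you do not have egrep, use grep -E (extended regular expressions).
# Note: lowercase -e only introduces a pattern; with it, + is not a "one or more" operator.
grep -E '^[[:lower:]]+$' <file>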

Breakdown of the above regular expression:

  • ^ anchors the match at the beginning of the line
  • [[:lower:]] is a POSIX character class that matches a single lowercase character
  • + causes the preceding token to be matched one or more times
  • $ anchors the match at the end of the line, so the whole line must consist of lowercase characters
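The same anchored pattern can be sketched in Java, the language of the first snippet on this page. This is an illustration rather than a replacement for grep: the class name LowercaseLines, the helper lowercaseOnly, and the sample lines are assumptions. In java.util.regex, \p{Lower} matches only a-z by default, while grep's [[:lower:]] depends on the current locale.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class LowercaseLines {
    // ^ and $ anchor the pattern; \p{Lower}+ requires one or more a-z characters.
    private static final Pattern LOWER_ONLY = Pattern.compile("^\\p{Lower}+$");

    // Keeps only the lines made up entirely of lowercase ASCII letters.
    static List<String> lowercaseOnly(List<String> lines) {
        return lines.stream()
                .filter(line -> LOWER_ONLY.matcher(line).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "hello", "Hello", "hello world", "", "abc123", "regex");
        System.out.println(lowercaseOnly(lines)); // [hello, regex]
    }
}
```

Lines with spaces, digits, or uppercase letters are rejected, and so is the empty line, because + needs at least one character.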

Regular expression to match any ASCII character

The following regular expression will match any ASCII character (character values [0-127]).

[\x00-\x7F]

The next regular expression does the exact opposite: it matches any character that is NOT ASCII (character values greater than 127).

[^\x00-\x7F]
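Both classes can be tried out in Java, where each backslash is doubled inside a string literal. The sketch below is an illustration only: the class name AsciiCheck and the helpers isAscii and stripNonAscii are hypothetical names, and the sample text uses the escape \u00e9 (é) so the source file itself stays pure ASCII.

```java
class AsciiCheck {
    // True when every character is in the ASCII range 0-127.
    // The empty string counts as ASCII because * allows zero characters.
    static boolean isAscii(String s) {
        return s.matches("[\\x00-\\x7F]*");
    }

    // Returns a copy of the input with every non-ASCII character removed.
    static String stripNonAscii(String s) {
        return s.replaceAll("[^\\x00-\\x7F]", "");
    }

    public static void main(String[] args) {
        String plain = "plain text";
        String accented = "caf\u00e9 au lait"; // contains one non-ASCII character

        System.out.println(isAscii(plain));          // true
        System.out.println(isAscii(accented));       // false
        System.out.println(stripNonAscii(accented)); // caf au lait
    }
}
```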

 

gEdit - regular expression to match any ASCII character


gEdit - regular expression to match any Non-ASCII character
