Regex Samples

From HaFrWiki42

Regex Samples Figures 1

Always be precise in the way you specify your data. The following question can be interpreted in multiple ways.
Create a regular expression that matches the figures:

  • 1 - 49
  • 01 - 49

Of course, ^[0-4]?[0-9] is much too simple: it also accepts 0 and 00 and, without an end anchor, matches only the 5 of 51.

Matches correctly, but first recognizes 11 as 1 and 1
Matches correctly, but first recognizes 11 as 1 and 1
Matches correctly, but recognizes 51 as 5 and 1
Matches everything correctly
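
The regexes that belonged to the captions above are not preserved on this page. As a minimal sketch, assuming the goal is to accept exactly 1 - 49 and 01 - 49, the pattern below (the name STRICT and the helper is_figure are illustrative, not taken from the original) can be compared with the too-simple one:

```python
import re

# The too-simple pattern from the text: no end anchor, and 0 / 00 slip through.
NAIVE = re.compile(r"^[0-4]?[0-9]")

# Assumed stricter pattern (not from the original page):
# 1-9 with an optional leading zero, or 10-49, anchored on both ends.
STRICT = re.compile(r"^(?:0?[1-9]|[1-4][0-9])$")


def is_figure(text):
    """Return True when text is exactly a figure from 1 (or 01) to 49."""
    return STRICT.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["1", "01", "11", "49", "0", "00", "50", "51"]:
        naive = NAIVE.match(sample)
        # Show what the naive pattern grabbed next to the strict verdict.
        print(sample, naive.group(0) if naive else None, is_figure(sample))
```

For the input 51 the naive pattern still reports a match, but only on the leading 5, which is the behaviour described in the captions.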

Regex Samples Figures 2

Always be precise in the way you specify your data. The following question can be interpreted in multiple ways.
Create a regular expression that matches the figures:

  • 25 - 67
Matches correctly 25 - 27 but only on one line
Matches correctly all figures 25 - 67
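
The original regexes for this sample are not preserved either. The sketch below assumes the range is split into 25 - 29, 30 - 59 and 60 - 67, and that the "only on one line" caption refers to the multiline option; the names RANGE_25_67, SINGLE_LINE and figures_in are hypothetical.

```python
import re

PATTERN = r"^(?:2[5-9]|[3-5][0-9]|6[0-7])$"

# With MULTILINE, ^ and $ match at the start and end of every line.
RANGE_25_67 = re.compile(PATTERN, re.MULTILINE)

# Without it, ^ and $ only anchor to the whole input.
SINGLE_LINE = re.compile(PATTERN)


def figures_in(text):
    """Return every line of text that is a figure from 25 to 67."""
    return RANGE_25_67.findall(text)


if __name__ == "__main__":
    data = "24\n25\n40\n67\n68\n125"
    print(figures_in(data))
    print(SINGLE_LINE.findall(data))
```

Without the multiline option the same pattern finds nothing in multi-line input, because the anchors then apply to the input as a whole.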

Regex Email Address

Always be precise in the way you specify your data. The following question can be interpreted in multiple ways.
Create an email checker that validates an email address entered as input.

Email checker
Please note
  • The case-insensitive option has to be set [1].
  • In the example above, the first line is not matched because there is a hidden space after the email address!
    Beware of such conditions, because 'Show invisibles' does not show these characters!
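
Both notes can be tried out in a few lines. This is only a sketch: it borrows the simple pattern quoted further down this page, and the helper name check and the sample addresses are made up for illustration.

```python
import re

PATTERN = r"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$"

# IGNORECASE is needed because the character classes list upper-case letters only.
EMAIL = re.compile(PATTERN, re.IGNORECASE)
CASE_SENSITIVE = re.compile(PATTERN)


def check(raw):
    """Return (matches_as_typed, matches_after_strip) for one input value."""
    as_typed = EMAIL.fullmatch(raw) is not None
    stripped = EMAIL.fullmatch(raw.strip()) is not None
    return as_typed, stripped


if __name__ == "__main__":
    value = "john@example.com "
    print(repr(value))   # repr() makes the trailing space visible
    print(check(value))
```

Stripping the input before validation, or showing it with repr(), is one way to catch whitespace that an editor does not display.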
Please note that not all unusual entries will work. For example, email addresses exist with the top-level domain museum.

If you want to validate such addresses too, then you have to use something like:
^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.(?:[A-Z]{2}|com|org|net|edu|gov|mil|biz|info|mobi|name|aero|asia|jobs|museum)$
If you use a simpler regex there is a trade-off. Look at:
^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$
This allows email addresses like xx.yy@domain.office, most likely entered by someone who forgot the real extension, such as .nl.
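
The trade-off can be made visible by running both patterns over the same addresses. The sketch below uses the two regexes exactly as quoted above with the case-insensitive option set; the function name verdicts and the sample addresses other than xx.yy@domain.office are illustrative assumptions.

```python
import re

# Pattern with an explicit list of top-level domains.
STRICT = re.compile(
    r"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\."
    r"(?:[A-Z]{2}|com|org|net|edu|gov|mil|biz|info|mobi|name|aero|asia|jobs|museum)$",
    re.IGNORECASE,
)

# Simpler pattern: any ending of two to six letters.
SIMPLE = re.compile(r"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$", re.IGNORECASE)


def verdicts(address):
    """Return (accepted_by_strict, accepted_by_simple)."""
    return (
        STRICT.fullmatch(address) is not None,
        SIMPLE.fullmatch(address) is not None,
    )


if __name__ == "__main__":
    for address in [
        "xx.yy@domain.office",     # six-letter ending that is not in the list
        "curator@gallery.museum",  # in the list, and six letters long
        "info@example.nl",         # two-letter country code
    ]:
        print(address, verdicts(address))
```

The listed variant rejects the .office typo at the cost of maintaining the list; the simple variant needs no maintenance but lets such typos through.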

There is an official standard, RFC 5322, and a regex for email addresses based on it, but alas it is not foolproof either. Here it is:
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])


Compromise

Email Checker Compromise
Does not allow strange leading characters. A foolproof email checker does not exist.

HTML Tags

The HTML of Wikipedia pages may contain cite references (a tag <sup id="cite_ref-xxxx">....</sup>).
When you copy such text you do not want to include them. [2]

Regex Description
/<sup\b id="cite_ref[^>]*>(.*?)<\/sup>/  This regex has one limitation: it cannot cope with CR/LF within the tags.
/<sup\b id="cite_ref[^>]*>(.*?)<\/sup>/gs  Works in the online tools, but not in PHP, where the g option is not allowed.
/<sup\b id="cite_ref[^>]*>(.*?)<\/sup>/s  Works in PHP with preg_replace.
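
For comparison, the same idea can be sketched in Python, where re.DOTALL plays the role of the s option and re.sub replaces every occurrence without needing a g option. The names CITE_REF, NO_DOTALL and strip_cite_refs and the sample HTML are illustrative, not part of the original page.

```python
import re

PATTERN = r'<sup\b id="cite_ref[^>]*>(.*?)</sup>'

# DOTALL lets . match line breaks, like the s option in the table above.
CITE_REF = re.compile(PATTERN, re.DOTALL)

# Same pattern without DOTALL, to show the CR/LF limitation.
NO_DOTALL = re.compile(PATTERN)


def strip_cite_refs(html):
    """Remove every <sup id="cite_ref-...">...</sup> element from html."""
    return CITE_REF.sub("", html)


if __name__ == "__main__":
    sample = (
        'Water<sup id="cite_ref-1" class="reference">'
        '<a href="#cite_note-1">[1]</a></sup> boils.'
    )
    print(strip_cite_refs(sample))
```

Other sup elements, such as exponents, are left alone because the pattern requires the id to start with cite_ref.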

Reference

  1. Regular Expressions: this email checker example is discussed in more detail on the web page of Jan Goyvaerts.
  2. reg101, Cite Ref wikipedia example.