Regex Samples: Difference between revisions
(14 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
{{TOCright}} | {{TOCright}} | ||
== Regex Samples == | == Regex Samples Figures 1 == | ||
Always be precise in the way you specify your data. The following question can be interpreted in more than one way. | Always be precise in the way you specify your data. The following question can be interpreted in more than one way. | ||
<br>Create a regular expression that matches the figures: | <br>Create a regular expression that matches the figures: | ||
Line 17: | Line 17: | ||
|- | |- | ||
| [[File:Regex-21-004.png|thumb|center|750px| Matches all correctly]] | | [[File:Regex-21-004.png|thumb|center|750px| Matches all correctly]] | ||
|} | |||
== Regex Samples Figures 1 == | |||
Always be precise in the way you specify your data. The following question can be interpreted in more than one way. | |||
<br>Create a regular expression that matches the figures: | |||
* 25 - 67 | |||
{| class="wikitableharmcenter" width="850" | |||
|- | |||
| [[File:Regex-22-002.png|thumb|center|750px| Matches correctly 25 - 27 but only on one line]] | |||
|- | |||
| [[File:Regex-22-001.png|thumb|center|750px| Matches correctly all figures 25 - 67]] | |||
|} | |||
== Regex Email Address == | |||
Always be precise in the way you specify your data. The following question can be interpreted in more than one way. | |||
<br>Create an email checker for input validation of an email address. | |||
{| class="wikitableharm" width="850" | |||
|- | |||
| [[File:Regex-23-001.png|thumb|center|750px| Email checker ]] | |||
|- style="text-align:left;" | |||
| '''Please note''' | |||
* The case-insensitive option has to be set. <ref>[http://www.regular-expressions.info/email.html Regular Expressions], This email checker example is discussed in more detail on the webpage of Jan Goyvaerts.</ref> | |||
* In the example above the first line is not matched because there is a '''''hidden space behind the email address'''''! <br>Beware of such conditions, because 'Show invisibles' does not show these characters! | |||
|- style="text-align:left;" | |||
| Please note that not every unusual address will be accepted. There appear to be email addresses with the top-level domain museum. | |||
If you want to validate such addresses too, then you have to use something like: | |||
<br>'''<nowiki>^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.(?:[A-Z]{2}|com|org|net|edu|gov|mil|biz|info|mobi|name|aero|asia|jobs|museum)$</nowiki>''' | |||
<br>A simpler regex comes with a trade-off. Look at: | |||
<br>'''<nowiki>^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$</nowiki>''' | |||
<br>This allows email addresses like '''xx.yy@domain.office''', most likely written by someone who forgot the real extension, such as '''.nl'''. | |||
|} | |||
There is an ''Official Standard: RFC 5322'' regex for email addresses, but alas it is not foolproof either. Here it is: <br> | |||
'''<nowiki> (?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)* | "(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f] | \\[\x01-\x09\x0b\x0c\x0e-\x7f])*") @ (?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])? | |||
| \[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3} (?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]: (?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f] | \\[\x01-\x09\x0b\x0c\x0e-\x7f])+) \]) | |||
</nowiki>'''. | |||
Even this regex is not foolproof. | |||
=== Compromise === | |||
{| class="wikitableharm" width="850" | |||
|- | |||
| [[File:Regex-23-002.png|thumb|center|750px|Email Checker Compromise]] | |||
|- | |||
| Will not allow strange preceding characters. A foolproof email checker does not exist. | |||
|} | |||
== HTML Tags == | |||
The HTML of a Wikipedia page may contain cite references (tags of the form <sup id="cite_ref-xxxx">....</sup>). | |||
<br>When you copy such text you usually do not want to keep them. <ref name="wikiciteref">[https://regex101.com/r/Ocir5r/1/ reg101], Cite Ref wikipedia example.</ref> | |||
{| class="wikitableharm" width="900px" | |||
|- | |||
! width="300px" | Regex | |||
! width="600px" | Description | |||
|- | |||
| '''/<sup\b id="cite_ref[^>]*>(.*?)<\/sup>/''' || This regex has one limitation: it cannot cope with CR/LF within the tags, because without the s option the dot does not match a newline. | |||
|- | |||
| '''/<sup\b id="cite_ref[^>]*>(.*?)<\/sup>/gs''' || Works in the online tools, but not in PHP, where the g option is not allowed. | |||
|- | |||
| '''/<sup\b id="cite_ref[^>]*>(.*?)<\/sup>/s''' || Works in PHP, for example with '''preg_replace'''. | |||
|} | |||
== See also == | == See also == | ||
Line 26: | Line 88: | ||
[[Category:Index]] | [[Category:Index]] | ||
[[Category:Tools]] |
Latest revision as of 14:33, 24 December 2019
Regex Samples Figures 1
Always be precise in the way you specify your data. The following question can be interpreted in more than one way.
Create a regular expression that matches the figures:
- 1 - 49
- 01 - 49
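Of course, ^[0-4]?[0-9] is much too simple: it also matches 0 and 00, and without an end anchor it matches the first two digits of a longer number such as 123.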
[Figure: Matches all correctly]
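The screenshots for this exercise are not available in text form, so the following is only a minimal sketch of one possible answer, written in Python. The pattern and the helper name `in_range_1_49` are assumptions for illustration, not the pattern shown in the original screenshots. `fullmatch` plays the role of the `^` and `$` anchors.

```python
import re

# Naive pattern quoted in the text: too permissive.
NAIVE = re.compile(r"^[0-4]?[0-9]")

# Stricter sketch (an assumption, not the original screenshot pattern):
# 1-9 with an optional leading zero, or 10-49.
STRICT = re.compile(r"(?:0?[1-9]|[1-4][0-9])")


def in_range_1_49(text):
    """Return True if text is a figure from 1 to 49, or 01 to 49."""
    return STRICT.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["1", "01", "9", "10", "49", "0", "00", "50", "123"]:
        print(sample, in_range_1_49(sample), NAIVE.match(sample) is not None)
```

The naive pattern accepts every sample in the loop, which is why a more explicit alternation is needed.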
Regex Samples Figures 1
Always be precise in the way you specify your data. The following question can be interpreted in more than one way.
Create a regular expression that matches the figures:
- 25 - 67
[Figure: Matches correctly 25 - 27 but only on one line]
[Figure: Matches correctly all figures 25 - 67]
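A range such as 25 - 67 has to be split into digit-wise alternatives. The sketch below, in Python, is one way to do that. The pattern and the function name `find_figures` are illustrative assumptions, as is the layout of the test data with one figure per line. The `re.MULTILINE` flag lets `^` and `$` anchor at every line instead of only at the start and end of the whole text.

```python
import re

# 25-29, 30-59, 60-67: one alternative per "shape" of the range.
# re.MULTILINE makes ^ and $ match at every line boundary.
FIGURES_25_67 = re.compile(r"^(?:2[5-9]|[3-5][0-9]|6[0-7])$", re.MULTILINE)


def find_figures(text):
    """Return all lines of text that hold a figure from 25 to 67."""
    return FIGURES_25_67.findall(text)


if __name__ == "__main__":
    sample = "24\n25\n40\n67\n68\n250"
    print(find_figures(sample))
```

Without `re.MULTILINE` the same pattern would only succeed when the whole text consists of a single figure.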
Regex Email Address
Always be precise in the way you specify your data. The following question can be interpreted in more than one way.
Create an email checker for input validation of an email address.
[Figure: Email checker]
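Please note
- The case-insensitive option has to be set. [1]
- In the example above the first line is not matched because there is a hidden space behind the email address! Beware of such conditions, because 'Show invisibles' does not show these characters!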
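Please note that not every unusual address will be accepted. There appear to be email addresses with the top-level domain museum.
If you want to validate such addresses too, then you have to use something like:
^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.(?:[A-Z]{2}|com|org|net|edu|gov|mil|biz|info|mobi|name|aero|asia|jobs|museum)$
A simpler regex comes with a trade-off. Look at:
^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$
This allows email addresses like xx.yy@domain.office, most likely written by someone who forgot the real extension, such as .nl.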
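The points about the case-insensitive option, the hidden trailing space and the trade-off between the two patterns can be tried out with a short Python sketch. The two patterns are the ones given in this page's wiki source. The constant names, the function `is_valid_email` and the choice to strip surrounding whitespace before matching are assumptions made for illustration.

```python
import re

# Simple pattern: any top-level domain of 2 to 6 letters.
SIMPLE = r"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$"

# Stricter pattern: two-letter country codes plus an explicit list.
STRICT_TLD = (
    r"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\."
    r"(?:[A-Z]{2}|com|org|net|edu|gov|mil|biz|info|mobi|name|aero|asia|jobs|museum)$"
)


def is_valid_email(address, pattern=SIMPLE):
    """Check address against pattern, ignoring case and outer whitespace."""
    cleaned = address.strip()  # assumption: hidden spaces should be tolerated
    return re.match(pattern, cleaned, re.IGNORECASE) is not None


if __name__ == "__main__":
    # A hidden trailing space makes the raw match fail.
    print(re.match(SIMPLE, "user@example.com ", re.IGNORECASE))
    # Without the case-insensitive option lowercase input fails too.
    print(re.match(SIMPLE, "user@example.com"))
    print(is_valid_email("xx.yy@domain.office"))
    print(is_valid_email("xx.yy@domain.office", STRICT_TLD))
```

Whether to strip whitespace silently or to reject the input is a design choice. Rejecting keeps the validation strict, while stripping is friendlier for copy-and-paste input.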
There is an Official Standard: RFC 5322 regex for email addresses, but alas it is not foolproof either. Here it is:
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)* | "(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f] | \\[\x01-\x09\x0b\x0c\x0e-\x7f])*") @ (?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
| \[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3} (?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]: (?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f] | \\[\x01-\x09\x0b\x0c\x0e-\x7f])+) \])
Even this regex is not foolproof.
Compromise
[Figure: Email Checker Compromise]
Will not allow strange preceding characters. A foolproof email checker does not exist.
HTML Tags
The HTML of a Wikipedia page may contain cite references (tags of the form <sup id="cite_ref-xxxx">....</sup>).
When you copy such text you usually do not want to keep them. [2]
Regex | Description |
---|---|
/<sup\b id="cite_ref[^>]*>(.*?)<\/sup>/ | This regex has one limitation: it cannot cope with CR/LF within the tags, because without the s option the dot does not match a newline. |
/<sup\b id="cite_ref[^>]*>(.*?)<\/sup>/gs | Works in the online tools, but not in PHP, where the g option is not allowed. |
/<sup\b id="cite_ref[^>]*>(.*?)<\/sup>/s | Works in PHP, for example with preg_replace. |
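The table describes the PHP behaviour. As a rough equivalent, here is a minimal Python sketch of the same idea. The function name `strip_cite_refs` and the sample HTML are assumptions for illustration. `re.DOTALL` corresponds to the s option, and `sub` replaces every match, so no counterpart of the g option is needed.

```python
import re

# re.DOTALL lets the dot match newlines, like the s option in the table.
CITE_REF = re.compile(r'<sup\b id="cite_ref[^>]*>(.*?)</sup>', re.DOTALL)


def strip_cite_refs(html):
    """Remove every <sup id="cite_ref...">...</sup> element from html."""
    return CITE_REF.sub("", html)


if __name__ == "__main__":
    sample = (
        'Regex<sup id="cite_ref-1" class="reference">\n'
        '<a href="#cite_note-1">[1]</a>\n'
        "</sup> is useful."
    )
    print(strip_cite_refs(sample))
```

Compiling the same pattern without `re.DOTALL` leaves the sample unchanged, because the line breaks inside the tag stop the lazy `.*?` from reaching `</sup>`.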
See also
Reference
- [1] Regular Expressions (http://www.regular-expressions.info/email.html). This email checker example is discussed in more detail on the webpage of Jan Goyvaerts.
- [2] reg101 (https://regex101.com/r/Ocir5r/1/). Cite Ref Wikipedia example.