rule

truthfultemporarily@feddit.org · edit-2 4 days ago

Curly braces are number of characters. Round braces are capture groups - their content can be used in search and replace. So abbba and regex a(b{3})a. Capture group 1 would be bbb.

“\” is usually some specific character. “\d” is any digit. “\s” is any whitespace character.

The thing on the board I don’t think is pure regex but a search and replace command and I think its wrong. It matches “272-43-17382” or similar digits. The 1 and 2 are usually capture groups in awk but on the board there is a “$” behind it which usually means end of string which doesn’t make sense. Should be I think:

“s/^(\d{3})-(\d{2})-(\d{5})$/\2 \1/g”

“Globally replace a string formatted like 273-34-27472 with 34 273”

Dæmon S.@catodon.rocks · 4 days ago

!onehundredninetysix@lemmy.blahaj.zone
There’s something else: the backslash followed by a positive natural number means a reference to the nth capture group, so:

"truthfultemporarily".match(/(t)(r)(u)\1hf\3(l)\1empo\2a\2i\4y/)

…as esoteric as it may sound, will match your Lemmy username, because the \1 will correctly match the first capture group which is t, \2 will match the second capture group which is r, and so on so forth… Oh, and it works beyond .replace contexts, during .match as well.

Source: I just learned through this very meme and, from now on, I’ll likely use this feature whenever I have to use RegExp because I love coding cryptic one-liners just for the sake of it.

Screenshot of DevTools illustrating the working of the aforementioned snippet, with its output correctly matching the string.

Bubs@lemmy.zip · 4 days ago

Heheh

I love bringing out all the nerds to talk on random niche topics :3

Zoop@beehaw.org · 3 days ago

Right!? I love what you’ve caused here. Especially since I was wondering the same thing myself. I’m really enjoying these neat, informative replies! I <3 NERDS!

Arthur Besse@lemmy.ml · edit-2 4 days ago

from the /g at the end i agree it looks like it could be a malformed attempt at an awk/perl/etc substitution operation, and your rewrite of it as an s/// does work, but the parts between the ^ and $ would also be a valid regexp in Perl-compatible regexp and some other dialects if not for the spaces at the start and end. And, the /g is also a flag (“Match globally, i.e., find all occurrences.”) for the m/// matching operator in Perl.

The \1 and \2 are backreferences to the capture groups, which can be used not only in the replacement part but also in the pattern itself.

You can see this working by running this command:

echo '123 - 45 - 67890 45 123'|perl -ne 'print if m/^(\d{3}) - (\d{2}) - (\d{5}) \2 \1$/g'

…which will echo the string because it matches the pattern. (if you edit the input string to change, for instance, the last digit, it will no longer match and will output nothing.)

There is no input that can match the pattern as it is in the comic with the space before the ^ and after the $ however.

Interestingly backreferences are also supported by POSIX Basic Regular Expressions (BRE), but are not supported by POSIX Extended Regular Expressions (ERE). (Also the former requires you to escape parenthesis and curly braces for them to become meta characters, while the latter requires you to escape them if they’re literals as Perl etc do. And neither of the POSIX flavors supports \d as a shortcut for [0-9].)

Voroxpete@sh.itjust.works · 4 days ago

from the /g at the end (and the spaces on the edges) i agree it looks like a malformed attempt at an awk/perl/etc substitution

The /g at the end is the global operater. It means, roughly, match across the entire input string.

This is completely valid regex, not a malformed attempt at anything. It’s just that the delimiters and operators are often omitted from regex in practical use so you may not be used to seeing them.

Arthur Besse@lemmy.ml · 4 days ago

yeah, i edited my comment while you were replying to note that /g is a valid flag for m/// as well. it is a valid perl matching operation precisely as-is but it can’t match anything due to the spaces it has before the ^ and after the $.

NaibofTabr@infosec.pub · 4 days ago

This definitely seems like a possible use case, but personally I think practical application of sed would be a bit advanced for a “Regex 101” course.

Bubs@lemmy.zip · 4 days ago

Yeah, I think I’m catching the general idea.