How I learned Regex
About me
I'm a passionate software engineer employed since 2000, having worked for such companies as IBM, Amdocs, Volition (THQ), Wolfram, STATS, Nielsen, and (currently) Truss Holdings. I'm a big proponent of science/tech education and free speech. I spend my free time split between my wife and children and training Brazilian jiu-jitsu.
Why I wanted to learn Regex
Sometimes you just need a powerful, focused tool to pick apart strings. Regular expressions are just that. There's an old saying that a software engineer confronted with a problem might say, "I'll use regular expressions," and now they have two problems. There is some truth to that humorous quip, but it belies the power of regular expressions. Used with care, regex is a skill every software engineer should possess.
How I approached learning Regex
I got started with regex the way most software engineers probably do: I saw it in someone else's code that I needed to debug or modify as part of my job. I started my quest to learn this odd language-of-sorts by browsing the web for tutorials and quick references. Today we have many websites that provide rich interactive displays of regex pattern matching (e.g., regexr.com, regex101.com), but back then I wasn't aware of any. So, I was stuck with pattern quick references, blog posts, and my own trial and error in code.
Challenges I faced
There are many nuances to the various implementations of regex, and one has to be particularly careful with these. For instance, some flavors of regex treat newlines differently with regard to the period (.) pattern element. Another challenge is that complicated regex patterns can be a bit of a pain to read and digest. Often I would find myself having to make step-by-step notes of how a regex engine would parse various patterns and inputs.
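To make the newline nuance concrete, here is a minimal sketch using Python's built-in re module. The sample text and variable names are my own illustration, and other regex flavors may behave differently or use a different flag for the same effect.

```python
import re

# Hypothetical sample input: two lines separated by a newline.
text = "first line\nsecond line"

# By default in Python's re module, '.' matches any character except a newline,
# so '.+' stops at the end of the first line.
default_match = re.search(r".+", text).group()

# With re.DOTALL, '.' also matches newlines, so '.+' consumes the whole string.
dotall_match = re.search(r".+", text, flags=re.DOTALL).group()

print(repr(default_match))  # 'first line'
print(repr(dotall_match))   # 'first line\nsecond line'
```

The same pattern gives two different answers depending on a single flag, which is exactly the kind of detail that is easy to miss when moving a pattern from one tool or language to another.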
Key takeaways
Learning regex really hammered home something I probably took for granted: the more powerful the tool, the more care should be taken in its understanding and use. You hear plenty of horror stories about poorly crafted regular expressions causing engineers many headaches down the road. What seemed to be the best way to pick apart string data quickly became a problem of its own. I believe that regex has a specific set of things it does better than anything else, but it should be employed in a fairly narrow range of uses. For instance, it can be a fun exercise to use a couple of regex patterns to scrape every bit of data from a large webpage, but it's generally not a tenable and stable solution for that task.
Tips and advice
Start small. Try out simple patterns on a site like regex101.com, and notice how the site breaks down which pattern elements match which input text, and why. Challenge yourself to construct more complicated patterns. Learn to use lookahead and lookbehind patterns. Learn about greedy vs. lazy matching.
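As a starting point for those last two tips, here is a small sketch in Python's re module. The sample strings and variable names are made up for illustration, and the syntax shown is the one Python supports; some other flavors restrict or lack lookbehind.

```python
import re

# Hypothetical sample input with two quoted words.
sentence = 'say "hello" and "goodbye"'

# Greedy: '.*' takes as much as it can, so the match runs from the
# first quote all the way to the last quote.
greedy = re.findall(r'".*"', sentence)

# Lazy: '.*?' takes as little as it can, so each quoted word matches separately.
lazy = re.findall(r'".*?"', sentence)

# Hypothetical sample input mixing two currency notations.
prices = "apples $5, pears 7 EUR, plums $12"

# Lookbehind: match digits only when a '$' comes right before them.
# The '$' itself is not part of the match.
dollars = re.findall(r"(?<=\$)\d+", prices)

# Lookahead: match digits only when ' EUR' comes right after them.
# The ' EUR' itself is not part of the match.
euros = re.findall(r"\d+(?= EUR)", prices)

print(greedy)   # ['"hello" and "goodbye"']
print(lazy)     # ['"hello"', '"goodbye"']
print(dollars)  # ['5', '12']
print(euros)    # ['7']
```

Pasting these patterns into a site like regex101.com and stepping through the matches is a good way to see why the greedy and lazy versions differ.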
Final thoughts and next steps
I'm now looking to solidify my understanding of some of the finer and more technical points in the world of regex, in particular the variations in the different implementations of regex. As always, the learning never ends. Whenever you think you understand something completely, you're probably only halfway there!