Some people, when confronted with a problem, think
“I know, I’ll use regular expressions.” Now they have two problems.
I’m not sure where I stand on the universal utility of regular expressions, but I recently ran into a problem with the markdown parser our team was using on a project.
The problem was that Wikipedia-style URLs weren’t getting parsed properly:
The specific part that was misbehaving was the parentheses included with the URL. A closing parenthesis has semantic significance in markdown, so the URL was getting parsed only up to the first closing parenthesis.
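To make the failure mode concrete, here is a minimal sketch. The pattern `naiveLink`, the helper `extractHref`, and the sample URL are all made up for illustration; this is not the pattern marked uses, only a simplified one that fails in the same way.

```javascript
// Hypothetical, simplified inline-link pattern: [text](href).
// NOT the pattern from marked; it only reproduces the symptom.
const naiveLink = /\[([^\]]*)\]\(([^)\s]*)\)/;

// Return the href captured from the first markdown link, or null.
function extractHref(markdown) {
  const match = naiveLink.exec(markdown);
  return match ? match[2] : null;
}

// Example Wikipedia-style URL with parentheses in the path.
const source = "[Foo](https://en.wikipedia.org/wiki/Foo_(bar))";

// The href stops at the first closing parenthesis.
console.log(extractHref(source));
// prints "https://en.wikipedia.org/wiki/Foo_(bar"
```

The captured URL is missing its final parenthesis, so the rendered link points to the wrong page.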
The issue had already been reported, but it was still unresolved a year later.
As I needed an immediate fix, I went to the source, soon finding myself amidst regular expressions like the following.
(The actual code divvies it up into two parts, “inside” and “href”, which, when combined, look like the above. Refer to: https://annot.io/github.com/chjj/marked/blob/38f1727/lib/marked.js?l=455)
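For readers unfamiliar with building a pattern out of named pieces, the idea can be sketched roughly as follows. The `compose` helper and the two fragments are assumptions made for this example and are far simpler than anything in marked; only the names “inside” and “href” come from the source.

```javascript
// Hypothetical fragments, much simpler than the real ones in marked.
const inside = /[^\]]*/;   // link text: anything up to the closing bracket
const href = /[^)\s]*/;    // link target: anything up to ")" or whitespace

// Substitute each named placeholder in the template with the
// source of the corresponding fragment, then compile the result.
function compose(template, parts) {
  let source = template.source;
  for (const [name, part] of Object.entries(parts)) {
    source = source.split(name).join(part.source);
  }
  return new RegExp(source, template.flags);
}

// The words "inside" and "href" act as placeholders here.
const linkTemplate = /^\[(inside)\]\((href)\)/;
const link = compose(linkTemplate, { inside, href });

const match = link.exec("[text](http://example.com)");
console.log(match[1], match[2]);
// prints "text http://example.com"
```

Splitting the pattern this way keeps each piece readable on its own, but the combined expression is what actually has to be debugged.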
Sure, some regex wizards wouldn’t even break a sweat, but trying to analyze and fix a blob like that can still be a pain for many of us. Then I remembered having seen a service called regexper.com, a site that helps turn regular expression strings into something like this.
Once visualized, the logical structure can be seen at a glance, which made it a breeze to find the exact part I needed to fix. In fact, you can hit the “display” button all you want to make sure you’re modifying things correctly.
Anyway, this was how the regular expression turned out for me.
OK. It got a bit longer, and it now detects parentheses within URLs.
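As a rough illustration of what “detecting parentheses within URLs” can look like, the sketch below accepts one level of balanced parentheses inside the link target. It is an assumed, simplified pattern with made-up names (`parenAwareLink`, `extractHrefWithParens`), not the expression I actually ended up with in marked.

```javascript
// Hypothetical pattern: the href is a run of either
//   - a character that is not a parenthesis or whitespace, or
//   - one balanced "(...)" group with no nesting inside it.
const parenAwareLink = /\[([^\]]*)\]\(((?:[^()\s]|\([^()\s]*\))*)\)/;

// Return the href captured from the first markdown link, or null.
function extractHrefWithParens(markdown) {
  const match = parenAwareLink.exec(markdown);
  return match ? match[2] : null;
}

console.log(extractHrefWithParens("[Foo](https://en.wikipedia.org/wiki/Foo_(bar))"));
// prints "https://en.wikipedia.org/wiki/Foo_(bar)"

console.log(extractHrefWithParens("[plain](http://example.com)"));
// prints "http://example.com"
```

The two alternatives inside the group are the “two conditions”: ordinary characters on one branch, a parenthesized chunk on the other. Nested parentheses are still out of reach for this sketch.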
But it’s still hard to see the correlation between the previous sentence and that string. And “diff”ing only makes for a good headache. So what if we compare images like this instead?
The modified part is group #2, and you can plainly see how the modification breaks out two conditions by taking into account the existence of parentheses.
Visualizations like these matter, especially during code review or when explaining your logic to colleagues. Regular expressions that were once hard to describe in words alone become much easier to express and understand with the help of images.
So, satisfied with the results, and since I was on the subject, I thought I’d run regexper on all the regular expressions in the “marked” module I was working with and share the resulting images: