I sat down to edit the Professions page in "Source" mode and encountered a bit of a mess, so I set out to clean it up. You can view the diff here, if you're curious: A lot of this was simple find/replace.

However, the trickiest thing was taking markup like this:

[ Marine Biologist]

And turning it into markup like this:

[[Marine Biologist]]

To do so, I used the regex find and replace tool in PHPStorm. It looks like this:

[Screenshot: PHPStorm + regular expression find/replace]

The regex to find is:

\[[a-zA-Z0-9_%]+\s([a-zA-Z ']+)\]

Breaking this apart, it means...

  • Find any instances of the wikia URL in single brackets (brackets are special in regex, so you need to escape them with "\"):
\[ ... \]
  • Now find the "raw URL" version of the name. This can include one or more (the "+" sign) upper/lowercase letters, numbers, underscores and the % sign (in the case of apostrophes in e.g. "Dentist's Office," which get encoded as "%27"):
[a-zA-Z0-9_%]+
  • Followed by a space (the "\s" matches a single whitespace character):
\s
  • Then finally, the "human readable" version of the link. This includes *actual* apostrophes, as well as spaces, but no numbers or underscores. We put this in parentheses so we can refer to it later:
([a-zA-Z ']+)
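
If you want to check the pieces outside the IDE, here is a minimal Python sketch of the same breakdown. A few assumptions: it uses Python's `re` module rather than PHPStorm's engine (the constructs used here behave the same in both), the letter ranges are written as `A-Z`, and the sample link is hypothetical, with the URL part shown as just the page name.

```python
import re

# The pattern written in verbose mode so each piece can carry a comment.
# Whitespace is ignored in verbose mode, except inside a character class,
# so the space in [a-zA-Z '] still counts.
LINK = re.compile(
    r"""
    \[                 # literal opening bracket (escaped)
    [a-zA-Z0-9_%]+     # raw URL name: letters, digits, underscores, %
    \s                 # one whitespace character between the two parts
    ([a-zA-Z ']+)      # group 1: the human-readable name
    \]                 # literal closing bracket (escaped)
    """,
    re.VERBOSE,
)

# Hypothetical sample in the single-bracket style described above.
sample = "[Dentist%27s_Office Dentist's Office]"

match = LINK.search(sample)
readable_name = match.group(1)
print(readable_name)
```

Only the last piece is wrapped in parentheses, so it is the only thing `group(1)` can return.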

Finally, just replace that with:

[[$1]]

This means take the *first* group of parentheses it encounters and stick whatever's in between them (in our case, the human readable name of the wiki link) between double brackets. If we had put parentheses around, say, the URL part as well, we would refer to that as $1 (because it comes first) and the human-readable version would be $2, and so on.
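
The same group numbering can be tried in a script. This is a rough Python sketch, not what PHPStorm runs internally: Python's `re.sub` writes the back-reference as `\1` where PHPStorm uses `$1`, and the sample text is made up for illustration (the URL part is shown as just the page name).

```python
import re

# Only the human-readable name is captured, so it is group 1.
one_group = r"\[[a-zA-Z0-9_%]+\s([a-zA-Z ']+)\]"

# Variant with the URL part captured as well: it comes first, so it becomes
# group 1 and the human-readable name shifts to group 2.
two_groups = r"\[([a-zA-Z0-9_%]+)\s([a-zA-Z ']+)\]"

# Hypothetical sample text.
text = (
    "See [Marine_Biologist Marine Biologist] and "
    "[Dentist%27s_Office Dentist's Office]."
)

converted = re.sub(one_group, r"[[\1]]", text)
converted_alt = re.sub(two_groups, r"[[\2]]", text)

print(converted)
print(converted_alt)
```

Both calls produce the same text; the only difference is which group number points at the human-readable name.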

And... voila! :)