Thursday, January 31, 2013

Useful Regular Expressions

Regular expressions, or RegEx, are powerful string-matching tools built into most popular programming languages. Both MEL and Python support them, and they can speed up your everyday scripting tasks.

Just the other day, I rewrote my sav+.mel script in Python; it versions up the current file with the click of a button. It's so convenient to use that it made me think of sharing some of the tools I used to write it.

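Anyway, in MEL, two commands you should know are gmatch and match. gmatch tells you whether a string matches a pattern by returning 1 or 0, but keep in mind that it uses glob-style wildcards (such as * and ?) rather than regular expressions. match takes a regular expression and returns the matched portion of the string, or an empty string if nothing matched. Let's say you wanted to extract information from a path: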
string $path = "C:/folder/image01.jpg";
// Note: Windows users might want to look
// into using substituteAllString($str, "/", "\\")
// to replace those '/' with '\' or
// vice versa, as needed 

There are a few important pieces of information that we can immediately extract using RegEx. To extract the file name with its extension, try:

// This says: Anchored at the end of the
// line ($), match every character that
// is not a '/' or '\'
// The match reaches back only as far as
// the last '/' or '\', so what comes back
// is everything after the final separator
string $fullName = `match "[^/\\]*$" $path`;
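
For comparison, here's a rough sketch of the same extraction in Python using the standard re module. The variable names (path, found, full_name) are just my own picks for illustration, and the pattern is the one from the MEL line above.

```python
import re

path = "C:/folder/image01.jpg"

# Same pattern as the MEL example: anchored at the end of the string ($),
# match the trailing run of characters that are neither '/' nor '\'.
# Because '*' can match zero characters, re.search always finds something here.
found = re.search(r"[^/\\]*$", path)
full_name = found.group(0)

print(full_name)  # image01.jpg
```

The raw string (r"...") keeps Python from interpreting the backslashes before the regex engine sees them.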

To extract the file name, try:

// This expression matches everything from the
// beginning of the line (^), except '.'
// After it encounters the '.', it ceases to
// match and returns the string
// Notice we are feeding it the result from
// the previous line above... 
string $fileName = `match "^[^.]+" $fullName`;
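
A Python take on the same step might look like the sketch below. It assumes the full name has already been pulled out of the path; full_name and file_name are illustrative names only.

```python
import re

full_name = "image01.jpg"

# From the start of the string (^), match one or more characters
# that are not '.', so the match stops at the first dot.
found = re.search(r"^[^.]+", full_name)

# re.search returns None when nothing matches (for example a name
# that starts with '.'), so fall back to an empty string.
file_name = found.group(0) if found else ""

print(file_name)  # image01
```

Since the match stops at the first dot, a name like "image01.tar.gz" would also come back as "image01".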

To get the extension by itself, execute this line:

// This says: From the end of the line,
// match everything back to the last
// '.', '/' or '\'
// Note that if there isn't a file extension,
// $ext will hold the whole file name instead
string $ext = `match "[^/\\.]*$" $path`;
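
Paths without an extension need a little care. Here's a hedged Python sketch that slightly changes the pattern: it requires a literal '.' in front of the captured text, so a path with no extension gives back an empty string. get_extension is a hypothetical helper name, not something from Maya or the standard library.

```python
import re

def get_extension(path):
    """Return the text after the last '.' in the file name, or '' if none.

    Hypothetical helper for illustration only.
    """
    # A literal dot, then a captured run of characters that are not
    # '.', '/' or '\', reaching all the way to the end of the string.
    found = re.search(r"\.([^/\\.]*)$", path)
    return found.group(1) if found else ""

print(get_extension("C:/folder/image01.jpg"))  # jpg
print(get_extension("C:/folder/image01"))      # prints an empty line
```

Because the captured run cannot contain a slash, a dot in a folder name (as in "C:/folder.v2/image01") is not mistaken for an extension.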

To extract any trailing version numbers at the end of the file name, try:

// This says: From the end of the line,
// extract some digits. If the file name
// doesn't end in digits, match returns
// an empty string, so check for ""
// before using the result
string $ver = `match "[0-9]+$" $fileName`;
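
Here's how the version extraction could be sketched in Python, with a check for names that don't end in digits. The next_version helper is purely hypothetical (it is not taken from my sav+ script); it just shows one way to bump the number while keeping the zero padding.

```python
import re

file_name = "image01"

# One or more digits anchored at the end of the string.
found = re.search(r"[0-9]+$", file_name)

# No trailing digits means no match, so fall back to an empty string.
ver = found.group(0) if found else ""
print(ver)  # 01

def next_version(ver):
    """Increment a version string, preserving its zero padding (hypothetical helper)."""
    return str(int(ver) + 1).zfill(len(ver))

if ver:
    print(next_version(ver))  # 02
```

Keeping the version as a string until the last moment is what preserves the leading zero.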

Now that you have all this data extracted, store it as useful information in an object. Unfortunately, MEL isn't an object-oriented language, so there is no straightforward way to create a class of objects that encapsulates attributes and behaviors. You have full support for these things when using C++ and Python. Here's how I would store the info in MEL:

// Now all the information is neatly packed in an array
string $fileInfo[] = { $path, $fullName, $fileName, $ext, $ver };
print $fileInfo;
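
For contrast, here's a minimal sketch of how the same information could be wrapped in a Python class. FileInfo and from_path are names I made up for this example, and the extension pattern requires a literal '.' so that extension-less paths give an empty string; treat it as one possible layout rather than the definitive one.

```python
import re
from dataclasses import dataclass

@dataclass
class FileInfo:
    """Hypothetical container for the pieces extracted from a path."""
    path: str
    full_name: str
    file_name: str
    ext: str
    ver: str

    @classmethod
    def from_path(cls, path):
        # File name with extension: everything after the last '/' or '\'.
        full_name = re.search(r"[^/\\]*$", path).group(0)

        # File name alone: everything before the first '.'.
        name_match = re.search(r"^[^.]+", full_name)
        file_name = name_match.group(0) if name_match else ""

        # Extension: text after the last '.', or '' when there is none.
        ext_match = re.search(r"\.([^/\\.]*)$", path)
        ext = ext_match.group(1) if ext_match else ""

        # Trailing version digits on the file name, or '' when there are none.
        ver_match = re.search(r"[0-9]+$", file_name)
        ver = ver_match.group(0) if ver_match else ""

        return cls(path, full_name, file_name, ext, ver)

info = FileInfo.from_path("C:/folder/image01.jpg")
print(info.full_name, info.file_name, info.ext, info.ver)
```

Unlike the MEL array, each piece gets a name (info.ext, info.ver), so you don't have to remember which index holds what.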

If you guys and gals would like to know how to extract other pieces of data from strings, let me know. I'll be happy to cover that, or similar topics, in another post.