Using Perl to edit keyword files (and decoding regex)

Perl is a language which is best suited to manipulating text. It is efficient, and free to use (after downloading it). So you can quickly imagine it is not the most useful language for structural engineers. That is almost true; one example where it can be useful is editing keyword files. Below is an example of a problem I came across and used Perl to solve (although there are a number of different ways to solve it without coding!).

The problem was segments (effectively 4 noded 2D elements) with their normal pointing in the wrong direction. The model was to be used for blast analysis in LS-DYNA so getting the normal in the right direction is pretty important, otherwise the loads end up going the wrong way!

The normal is defined by the order of nodes. To swap the normal all you need to do is swap nodes 2 and 4 around in the segment topology definition. Simple!
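To see why swapping nodes 2 and 4 reverses the normal, here is a minimal sketch. It assumes the normal follows the right hand rule and can be taken as the cross product of the two diagonals of the segment; the helper name quad_normal and the unit square coordinates are made up for illustration and are not taken from LS-DYNA.

```perl
use strict;
use warnings;

# Hypothetical helper: un-normalised normal of a 4 noded segment, taken as the
# cross product of its diagonals (n1->n3) x (n2->n4). Each node is [x, y, z].
sub quad_normal {
    my ($n1, $n2, $n3, $n4) = @_;
    my @a = map { $n3->[$_] - $n1->[$_] } 0 .. 2;   # diagonal n1 -> n3
    my @b = map { $n4->[$_] - $n2->[$_] } 0 .. 2;   # diagonal n2 -> n4
    return (
        $a[1] * $b[2] - $a[2] * $b[1],
        $a[2] * $b[0] - $a[0] * $b[2],
        $a[0] * $b[1] - $a[1] * $b[0],
    );
}

# A unit square in the XY plane, numbered anticlockwise when viewed from +Z.
my @nodes = ([0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]);

my @original = quad_normal(@nodes);
my @flipped  = quad_normal(@nodes[0, 3, 2, 1]);   # nodes 2 and 4 swapped

print "original: @original\n";
print "flipped:  @flipped\n";
```

The z component changes sign between the two calls, which is the flip we are after.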

Below is the code I used:

use strict;
use warnings;

# set up our filenames...
my $fileInName  = 'v:\myfile.k';
my $fileOutName = 'v:\myfile_mod.k';

open my $fileIn,  "<", $fileInName  or die "can't read input file $fileInName: $!";    # open the text file to read
open my $fileOut, ">", $fileOutName or die "can't write output file $fileOutName: $!"; # open the text file to write to

while (<$fileIn>) {
    print $fileOut "$_"; # print the line, we will catch any lines to change later
    if ($_ =~ m/^\*SET_SEGMENT/) { # *SET_SEGMENT keyword found...
        print $fileOut ($_ = (<$fileIn>)); # always keep the next line, it is the line defining the SSID...
        if ($_ =~ m/^\s{9}1/) { # if SSID == 1...
            while (defined($_ = <$fileIn>) && $_ !~ m/^[*\$]/) { # keep going until we hit a comment, a new keyword, or the end of the file...
                my ($n1, $n2, $n3, $n4) = unpack("(A10)4", $_); # split into fields of 10 characters, we will lose the newline character
                print $fileOut pack("(A10)4", $n1, $n4, $n3, $n2) . "\n"; # repack and send to our new file, with nodes 2 and 4 swapped
            } # end of the inner while loop
            print $fileOut "$_" if defined $_; # finally print the line which exits the inner while loop, and we are finished!
        } # end of the second if
    } # end of the first if
} # end of the outer while loop

close($fileIn);  # close the input file
close($fileOut); # close the output file

So let's have a look at what is going on here! Here are some basic points to understanding Perl:

  • # is the comment character in Perl.
  • $ precedes a variable name, so up at the top we declare two string variables, our filenames to use.
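  • There is a special variable, $_, the default variable: many operations read or set it when no other variable is named, so inside while (<$fileIn>) it holds the line just read. It is a little like “ans” in Matlab.
  • Statements must terminate with a semicolon; the closing brace of a loop or if block does not need one. Miss a semicolon out and Perl just won’t compile the script!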
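The points above can be seen in a few lines. This is only a sketch; the keyword names in the list and the variable names are picked for illustration.

```perl
use strict;
use warnings;

# A comment starts with '#' and runs to the end of the line.
my $label = 'keyword';             # '$' marks a scalar variable; no type is declared
my @seen;                          # '@' marks an array

# With no loop variable named, 'for' puts each item in $_ in turn.
for ('*NODE', '*SET_SEGMENT', '*END') {
    push @seen, length;            # 'length' with no argument works on $_
}                                  # no semicolon needed after the closing brace

print "$label lengths: @seen\n";
```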

Ok, variables set up; note we are not declaring their type, there is no need to. Now we open the files. The open files are held in the filehandle variables $fileIn and $fileOut; note that “<” opens the first to read and “>” opens the other to write.

We also have an or die statement which will execute if there is a problem opening the files. Note that the error message needs no special concatenation: Perl sees the $ in the double-quoted string and substitutes the value of the variable.
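A short sketch of the difference between double and single quotes may help here. The filename is the one from the script; the other variable names are made up for the example.

```perl
use strict;
use warnings;

my $fileInName = 'v:\myfile.k';

my $interpolated = "can't read input file $fileInName";    # double quotes: the variable's value is substituted
my $literal      = 'can\'t read input file $fileInName';   # single quotes: the text is kept as typed

print "$interpolated\n";
print "$literal\n";
```

This is also why the filenames themselves are in single quotes: the backslash in a Windows path is left alone.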

So we read the file one line at a time using <$fileIn>, and do something for as long as there are lines left to read. The loop body is bound by curly braces {}, which will look different to those used to Matlab and VBA, but familiar if you have used Java etc.

Every time we read a line from the file the read position moves on a line, so moving through the file is easy. There is a print statement, which just prints the current line to the output file. Note that we have some logic later in the script which reads further lines itself once we have found what we are looking for, so while it looks like the print $fileOut “$_” line is printing everything straight to the output file, it is not: lines read inside the inner loop never reach it.
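The way the read position moves on can be tried without a real file. The sketch below reads from an in-memory string standing in for a keyword file; the three line content is made up.

```perl
use strict;
use warnings;

my $text = "*KEYWORD\n*NODE\n*END\n";   # made-up stand-in for a keyword file
open my $fh, "<", \$text or die "can't open in-memory file: $!";

my $first  = <$fh>;    # reads line 1 and moves the read position to line 2
my $second = <$fh>;    # reads line 2 and moves the read position to line 3

my @rest;
while (<$fh>) {        # carries on from line 3; each pass puts the line in $_
    chomp;
    push @rest, $_;
}
close($fh);

print "first: $first", "second: $second", "rest: @rest\n";
```

The while loop only ever sees the lines that have not already been read, which is exactly how the inner reads in the script keep segment lines away from the first print statement.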

Now we need some logic to find what we are looking for. The if statement takes the current line, $_, and tries to match it against “*SET_SEGMENT”. The binding operator is =~ and the match operator is m, with the pattern we are searching for bound by the slashes, //. Note we have a backslash to escape the asterisk: asterisk is a special character in the world of Perl searches, and here we want to use it literally. Finally the ^ anchors the match to the beginning of the line, so the keyword is only found if the line starts with it.
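Here is the same match tried against a few made-up lines, as a sketch of what the escape and the anchor each do.

```perl
use strict;
use warnings;

my @lines = (
    "*SET_SEGMENT\n",                              # the keyword we want
    "\$ a comment mentioning *SET_SEGMENT\n",      # keyword present, but not at the start
    "*SET_NODE_LIST\n",                            # a different keyword
    "SET_SEGMENT\n",                               # no leading asterisk
);

# Keep only the lines that start with a literal "*SET_SEGMENT".
my @hits = grep { $_ =~ m/^\*SET_SEGMENT/ } @lines;

print "matched: $_" for @hits;
```

Only the first line gets through. Note the pattern would also accept longer keywords that begin the same way, such as a titled variant, so check what your own keyword file contains.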

Phew. That seemed complicated. But it is actually very simple once you know what you are doing.

So we have found a segment set, what next? Well, the next line will have the segment set ID on it (that is just the nature of LS-DYNA keyword files), and I only want to flip the segments in set 1, so we match for 1 on the next line. The field width in the keyword file is 10, so there will be 9 blank characters beforehand, and we write \s{9} to represent them: \s represents a whitespace character and {9} means exactly nine of them.
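A quick sketch of that test against a few set IDs, each right-justified in a 10 character field with sprintf. The helper name is_set_one and the IDs are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the first 10 character field holds the ID 1.
sub is_set_one {
    my ($line) = @_;
    return $line =~ m/^\s{9}1/ ? 1 : 0;
}

for my $ssid (1, 2, 10, 11) {
    my $line = sprintf("%10d\n", $ssid);   # ID right-justified in a field of width 10
    printf "SSID %2d: %s\n", $ssid, is_set_one($line) ? "match" : "no match";
}
```

IDs 10 and 11 do not match because they only leave 8 blanks in front of the first digit.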

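Great, we have found the segment set. Every line after this defines a segment by four nodes (segments don’t have element IDs, each line is just the numbers of the four nodes defining the topology). Before manipulating the topology of the segments we need to make sure we don’t overshoot the end of the set definition, so we put in another while loop, checking to see if the line begins with either a *, i.e. a new keyword, or a $, i.e. a comment line. Note that we continue the while as long as we DON’T match these characters, so our match operator becomes !~ rather than =~. We have two options, which Perl handles easily with a character class in square brackets: it matches any one of the characters listed inside. The characters are simply listed one after another, not comma separated; a comma inside the brackets is itself a character to match, so it would end the loop on any line containing a comma. The $ needs an escape backslash so Perl does not treat it as the start of a variable name, while the * is already literal inside square brackets (escaping it does no harm). A ^ in front of the class restricts the test to the first character of the line. Simple!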
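The effect of the anchor and of a comma inside the square brackets can be checked with a small sketch. The helper data_lines and the sample lines are made up; it simply counts how many leading lines a given stop pattern would treat as segment data.

```perl
use strict;
use warnings;

my $anchored   = qr/^[*\$]/;     # first character of the line is * or $
my $unanchored = qr/[\*,\$]/;    # a *, a comma or a $ anywhere in the line

my @lines = (
    "       101       102       103       104\n",   # fixed-width segment data
    "101,102,103,104\n",                              # data containing commas
    "\$ a comment line\n",
    "*END\n",
);

# Hypothetical helper: count the leading lines that do not match the stop pattern.
sub data_lines {
    my ($stop, @candidates) = @_;
    my $count = 0;
    for my $line (@candidates) {
        last if $line =~ $stop;
        $count++;
    }
    return $count;
}

print "anchored:   ", data_lines($anchored, @lines), " data lines\n";
print "unanchored: ", data_lines($unanchored, @lines), " data lines\n";
```

The anchored pattern stops at the comment line, while the pattern with the comma in the class stops one line early, on the data line that happens to contain commas.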

Finally we can change the order of nodes. First we read in the four nodes. Again the field width of the keyword file is 10 characters. We know each segment will have 4 numbers, so we can read those numbers straight into 4 variables:

my ($n1,$n2,$n3,$n4) = unpack("(A10)4",$_);

The my lets us define the four new variables, while the unpack splits a string into 4 fields. It acts on $_, the second argument. The first argument tells unpack we have A10 = 10 ASCII characters (the keyword file field width), occurring 4 times: (A10)4. Now we have the four node numbers in four variables. We just need to reorder them and write them to the output file.

We used unpack to deconstruct the string, so it is no surprise we use pack to reconstruct it, with the same syntax. Unlike unpack, where we assigned the result to variables, the output of pack is printed straight to $fileOut, with a newline added to replace the one dropped by unpack.
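The unpack and pack pair can be tried on a single line in isolation. In this sketch the helper name flip_segment and the node numbers 101 to 104 are made up; the line is built with sprintf so each number sits right-justified in a 10 character field.

```perl
use strict;
use warnings;

# Hypothetical helper: swap the 2nd and 4th fields of a fixed-width segment line.
sub flip_segment {
    my ($line) = @_;
    my ($n1, $n2, $n3, $n4) = unpack("(A10)4", $line);   # four fields of 10 characters
    return pack("(A10)4", $n1, $n4, $n3, $n2) . "\n";    # same template, new order
}

my $line = sprintf("%10d%10d%10d%10d\n", 101, 102, 103, 104);   # made-up node numbers

print $line;
print flip_segment($line);
```

Only the first 40 characters are read, so anything after the fourth field on a line would not be carried over by this template.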

When the while statement finds the end of the segment set definition (by finding a * or $ character at the start of the line) we need to print that line to the output, otherwise it will be lost.

And we are done. So what is this “decoding regex” I talked about? Well, regex = regular expressions, a common way of matching strings in files. Our match operators (=~ m// and !~ m//) used regex. The compactness of the code speaks volumes about how useful learning regex is. The benefits are further magnified when you realise that Vim uses regex too, and Vim is pretty much the best way to edit and manipulate long keyword files of models!

