James Stanley

How to interrupt a regex in Perl

Wed 23 March 2016

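Since 5.8.0, Perl's "safe signals" defer the delivery of a signal to a custom signal handler until the interpreter reaches a safe point, which is between opcodes. A regex match is a single opcode, so the handler does not run until the match has finished. This means you cannot simply use alarm() with a custom handler to interrupt a long-running regex.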

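The default SIGALRM handler is not deferred in this way, because the kernel terminates the process directly without involving the interpreter. It is simple enough to create a child process to run the regex match, and use the default SIGALRM handler in the child to allow it to be timed out. Here is an example function to run a regex match with a timeout: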

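sub match {
    my ($string, $regex, $timeout_secs) = @_;
    
    my $pid = fork();
    die "can't fork: $!" if !defined $pid;
    
    if ($pid == 0) {
        # child process: with the default handler, SIGALRM kills the
        # process immediately, even in the middle of a regex match
        $SIG{ALRM} = 'DEFAULT';
        alarm $timeout_secs;
        
        # report the result through the exit code: 0 = match, 1 = no match
        exit(($string =~ $regex) ? 0 : 1);
    }
    
    # parent process: block until the child exits or is killed
    waitpid($pid, 0);
    
    # low 7 bits of $? are non-zero if the child was killed by a signal
    die "regex timed out\n" if $? & 0x7f;
    # high byte of $? is the child's exit code
    return !($? >> 8);
}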

The child process restores the default SIGALRM handler, starts an alarm, and checks whether the string matches the regex. It exits with status 0 if the string matches, and 1 otherwise.

The parent process waits for the child to exit. $? then holds the child's wait status. "$? & 0x7f" tells us which signal, if any, the child died from (we just assume it was SIGALRM); it is 0 if the child exited normally. "$? >> 8" tells us the process exit code, which tells us whether the regex matched or not.
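To make the layout of $? concrete, here is a minimal sketch of a helper that splits a wait status into its fields. The name decode_status and the shape of the hash it returns are assumptions for illustration, not part of the function above. It also looks up the signal name from %Config, which would let a caller check for ALRM explicitly instead of assuming it.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper (not from the original function): split a wait
# status such as $? into the fields that are packed into it.
sub decode_status {
    my ($status) = @_;

    # Signal names as listed by this Perl build, indexed by signal number.
    my @signal_names = split ' ', $Config{sig_name};

    my $signal = $status & 0x7f;    # low 7 bits: terminating signal, 0 if none
    return {
        signal      => $signal,
        signal_name => $signal ? $signal_names[$signal] : undef,
        core_dumped => ($status & 0x80) ? 1 : 0,    # bit 7: core dump flag
        exit_code   => $status >> 8,                # high byte: exit code
    };
}

# Run a child that exits with code 3, then decode the status it left in $?.
system($^X, '-e', 'exit 3');
my $info = decode_status($?);
print "exit code: $info->{exit_code}\n";    # prints "exit code: 3"
```

With a helper like this, the parent could compare signal_name against 'ALRM' and treat any other signal as a different kind of failure.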

Given this information, the parent process either dies with "regex timed out\n" or returns a true value if the regex matched and a false value otherwise.
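Because a timeout is reported with die, a caller would normally wrap the call in eval. The following is a self-contained sketch under a few assumptions: the function is repeated under the hypothetical name match_with_timeout so the script runs on its own, it uses POSIX::_exit instead of exit so the child skips any END blocks and buffered output inherited from the parent, and the "slow" pattern is a stand-in that simply waits inside an embedded code block rather than a real pathological regex.

```perl
use strict;
use warnings;
use POSIX ();

# Variant of the function above. The name and the use of POSIX::_exit
# are assumptions of this sketch.
sub match_with_timeout {
    my ($string, $regex, $timeout_secs) = @_;

    my $pid = fork();
    die "can't fork: $!" if !defined $pid;

    if ($pid == 0) {
        # child: the default SIGALRM action kills the process outright
        $SIG{ALRM} = 'DEFAULT';
        alarm $timeout_secs;
        # _exit skips END blocks and does not flush inherited buffers
        POSIX::_exit(($string =~ $regex) ? 0 : 1);
    }

    waitpid($pid, 0);
    die "regex timed out\n" if $? & 0x7f;
    return !($? >> 8);
}

# Stand-in for a slow pattern: the code block waits 30 seconds before
# the match can complete, far longer than the 1 second timeout used below.
my $slow = qr/(?{ select(undef, undef, undef, 30) })/;

my @cases = (
    [ 'matching pattern',     qr/wor/ ],
    [ 'non-matching pattern', qr/xyz/ ],
    [ 'slow pattern',         $slow   ],
);

for my $case (@cases) {
    my ($label, $regex) = @$case;
    my $matched = eval { match_with_timeout('hello world', $regex, 1) };
    if ($@) {
        print "$label: $@";    # $@ already ends with a newline
    }
    else {
        print "$label: ", ($matched ? "matched" : "no match"), "\n";
    }
}
```

Checking $@ rather than the return value matters here, since a non-match is itself a false value and would otherwise be hard to tell apart from a timeout.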

James Stanley - james@incoherency.co.uk | jesblogfnk2boep4.onion | [rss]