James Stanley


How to interrupt a regex in Perl

Wed 23 March 2016

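Since 5.8.0, Perl's "safe signals" mechanism defers delivery of a signal to a custom handler until the interpreter reaches a safe point, which is between opcodes. A regex match runs as a single opcode, so you cannot simply use alarm() with your own handler to interrupt a long-running regex: the handler will not run until the match has finished.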

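It is simple enough to fork a child process to run the regex match, and to use the default SIGALRM action in the child so that it can be timed out. The default action is carried out by the operating system rather than by Perl, so it terminates the child immediately, even in the middle of a match. Here is an example function that runs a regex match with a timeout: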

sub match {
    my ($string, $regex, $timeout_secs) = @_;
    
    my $pid = fork();
    die "can't fork: $!" if !defined $pid;
    
    if ($pid == 0) {
        # child process
        $SIG{ALRM} = 'DEFAULT';
        alarm $timeout_secs;
        
        exit(($string =~ $regex) ? 0 : 1);
    }
    
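    # parent process: waitpid() stores the child's wait status in $?
    waitpid($pid, 0);
    
    # low 7 bits: number of the signal that killed the child (0 if none)
    die "regex timed out\n" if $? & 0x7f;
    # high byte: the child's exit code, where 0 means the regex matched
    return !($? >> 8);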
}

The child process restores the default SIGALRM action, starts an alarm, and checks whether the string matches the regex. It exits with status 0 if the string matches, and 1 otherwise. If the alarm fires first, the child is killed by SIGALRM and never reaches exit().

The parent process waits for the child to exit. $? then holds the child's wait status, which packs two values into one number. "$? & 0x7f" tells us which signal, if any, killed the child (we just assume it was SIGALRM). "$? >> 8" is the child's exit code, which tells us whether the regex matched or not.
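To make the two halves of $? concrete, here is a minimal sketch that decodes the wait status of one child that exits normally and one that is killed by the default SIGALRM action. The helper names wait_status_of and decode_status are made up for this illustration and are not part of the function above.

```perl
use strict;
use warnings;
use POSIX qw(SIGALRM);

$| = 1;  # unbuffered output, so a forked child never inherits pending text

# Hypothetical helper: run a code ref in a child, return the raw wait status.
sub wait_status_of {
    my ($child_code) = @_;

    my $pid = fork();
    die "can't fork: $!" if !defined $pid;

    if ($pid == 0) {
        $child_code->();
        exit 0;
    }

    waitpid($pid, 0);
    return $?;
}

# Hypothetical helper: split a wait status into (signal number, exit code).
sub decode_status {
    my ($status) = @_;
    return ($status & 0x7f, $status >> 8);
}

# A child that exits normally with code 3.
my ($sig1, $code1) = decode_status(wait_status_of(sub { exit 3 }));
print "normal exit: signal=$sig1 code=$code1\n";

# A child that is killed by the default SIGALRM action after one second.
my ($sig2, $code2) = decode_status(wait_status_of(sub {
    $SIG{ALRM} = 'DEFAULT';
    alarm 1;
    sleep 10;
}));
printf "killed: signal=%d code=%d (SIGALRM is %d)\n", $sig2, $code2, SIGALRM;
```

For the first child the signal part is 0 and the exit code is 3. For the second, the signal part equals SIGALRM and the exit code is 0, because the child never got as far as calling exit().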

Given this information, the parent process either dies with "regex timed out\n", or returns a true value (1) if the regex matched and a false value (the empty string) otherwise.
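Since match() reports a timeout by dying, the caller needs an eval to tell the three outcomes apart. Here is one way that might look. The describe_match wrapper is a hypothetical name made up for this sketch, match() is repeated only so that the example runs on its own, and the "slow" pattern is a contrived stand-in that sleeps inside a regex code block rather than a genuinely pathological regex.

```perl
use strict;
use warnings;

$| = 1;  # unbuffered output, so a forked child never inherits pending text

# Copy of match() from above, repeated so this example is self-contained.
sub match {
    my ($string, $regex, $timeout_secs) = @_;

    my $pid = fork();
    die "can't fork: $!" if !defined $pid;

    if ($pid == 0) {
        $SIG{ALRM} = 'DEFAULT';
        alarm $timeout_secs;
        exit(($string =~ $regex) ? 0 : 1);
    }

    waitpid($pid, 0);
    die "regex timed out\n" if $? & 0x7f;
    return !($? >> 8);
}

# Hypothetical wrapper: returns 'match', 'no match' or 'timeout'.
sub describe_match {
    my ($string, $regex, $timeout_secs) = @_;

    # match() returns a defined value (1 or "") unless it dies.
    my $matched = eval { match($string, $regex, $timeout_secs) };
    if (!defined $matched) {
        return 'timeout' if $@ eq "regex timed out\n";
        die $@;  # some other failure, e.g. fork() failed
    }
    return $matched ? 'match' : 'no match';
}

# Contrived stand-in for a slow regex: the code block sleeps mid-match.
my $slow = qr/(?{ sleep 5 })/;

print describe_match('hello world', qr/wor/, 2), "\n";
print describe_match('hello world', qr/xyz/, 2), "\n";
print describe_match('hello world', $slow,   1), "\n";
```

Comparing $@ against the exact message keeps a genuine timeout separate from other errors, such as a failed fork(), which are re-thrown rather than swallowed.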


