[Seattle-SAGE] Re: [SLL] Need perl help - regex issue

Tim Maher linux at consultix-inc.com
Mon Oct 24 14:41:40 PDT 2005

On Mon, Oct 24, 2005 at 01:39:54PM -0700, Ski Kacoroski wrote:
> Hi,
> I have a problem with regexes. 

IMHO, you're wrong about that; your problem seems to be a
misunderstanding of $1, subroutines, and warning messages.

> I am using perl v5.8.5 on linux.  From a 
> perl debug session:
> I create a subroutine that drops everything after the space in the 
> variable $f:
>   DB<18> sub test { $f = "first_in_sub"; $f =~ /(.*)\s.*$/; print $1;}

The matching operator will fail, because the white-space
character required by \s isn't present in $f, and that will
prevent $1 from being set; if you've got warnings enabled,
print will tell you that (assuming $1 wasn't already set
elsewhere in your program).

Assuming your use of the word "drops" above means "stores", 
and you want to lose everything from the /first/ white-space character
to the end, your substitution should look like this:

s/^(.*?)\s.*$/$1/;  # ? requests stingy match, to prevent gobbling spaces in .*

Or more simply:
s/\s.*$//;  # deletes everything from the first white-space character to the end
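
As a quick check of the capture-based substitution, here is a minimal sketch. The helper name drop_after_space is made up for illustration and is not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this example): keep only the
# text that precedes the first white-space character.
sub drop_after_space {
    my ($s) = @_;
    # The stingy (.*?) stops at the first white-space character; if the
    # string has none, the substitution fails and $s is left unchanged.
    $s =~ s/^(.*?)\s.*$/$1/;
    return $s;
}

print drop_after_space('first_in_sub last_in_sub'), "\n";  # prints "first_in_sub"
print drop_after_space('first_in_sub'), "\n";              # prints "first_in_sub"
```

Note that a string without white-space comes back untouched, because a failed substitution changes nothing.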

You need to have warnings turned on, so you'll see diagnostic messages
when you try to print uninitialized values.
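
To see the diagnostic for yourself, here is a small sketch that collects warnings through a $SIG{__WARN__} handler. It assumes a fresh program in which no earlier match has set $1; the variable names @seen and $label are made up for the example.

```perl
use strict;
use warnings;

my @seen;                                     # collected warning texts
$SIG{__WARN__} = sub { push @seen, $_[0] };   # record warnings instead of printing them

my $f = "first_in_sub";
$f =~ /(.*)\s.*$/;                  # fails: there is no white-space in $f
my $label = "captured: " . $1;      # $1 is undefined here, so this line warns

print "warnings seen: ", scalar(@seen), "\n";
print $seen[0] if @seen;            # an "uninitialized value" diagnostic
```

The exact wording of the diagnostic differs between Perl versions, so only the word "uninitialized" should be relied on.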
> I then run the same command, but run test() right after it and for some 
> reason, $1 in the subroutine has the match from the $name in the calling 
> program.
>   DB<24> $name =~ /(.*)\s.*$/; print $1;test()
> firstfirst

The mistake you're making is in assuming that a failed match
attempt will unset $1, but it doesn't; for this reason, you
generally don't want to access $1 unless you're sure the match
(or substitution) you just tried was successful.
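
The stale value is easy to reproduce outside the debugger. The sketch below is only an illustration; the variable names $ok, $stale and $result are made up for it.

```perl
use strict;
use warnings;

my $name = "first last";
my $f    = "first_in_sub";

$name =~ /(.*)\s.*$/;                    # succeeds and sets $1 to "first"
my $ok    = ($f =~ /(.*)\s.*$/) ? 1 : 0; # fails: $f contains no white-space
my $stale = $1;                          # still "first", left over from $name

print "match on \$f succeeded? $ok\n";          # prints 0
print "\$1 after the failed match: $stale\n";   # prints first

# Safer: read $1 only in the branch where the match succeeded.
my $result = ($f =~ /(.*)\s.*$/) ? $1 : "";
print "guarded result: '$result'\n";            # prints ''
```
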
> I really need to have matches in subroutines work correctly all the time 
> (e.g. return nothing when there is no match).  Appreciate any ideas you 
> have for resolving this.
> cheers,
> ski

Code it like this:

sub test1 {
	my $string = shift;

	# Return the text before the first white-space character,
	# or "" if the match fails.
	return $string =~ /^(.*?)\s/ ? $1 : "";
}
# NOTE: space-less strings won't get returned; is that what you want?

Then use the sub like this:

print test1('first_in_sub last_in_sub');

Warning: code untested!

And by the way, this kind of question is better directed to spug-list at pm.org.
| Tim Maher, PhD     (206) 781-UNIX      (866) DOC-PERL     (866) DOC-UNIX |
| tim(AT)Consultix-Inc.Com  http://TeachMePerl.Com  http://TeachMeUnix.Com |
| CLASSES: Perl and CGI Programming, 11/14-18    UNIX Fundamentals: 12/6-9 |
|     Watch for my upcoming book: "Minimal Perl for UNIX/Linux People"     |
|  See http://minimalperl.com for details, ordering, and email-list signup |
