[CS-FSLUG] Sed and other tools

Robert Wohlfarth rbwohlfarth at gmail.com
Fri Jul 8 09:56:58 CDT 2011


On Fri, Jul 8, 2011 at 9:12 AM, Ed Hurst <ehurst at soulkiln.org> wrote:

> Having never messed with Perl, I can scarcely begin to parse this. What
> I need now is how to tell this script to check all the HTML files in a
> given directory. I saved it as change.pl and learned how to check the
> syntax. It's fine, but it does nothing as it stands.


Sorry about that, Ed. Some documentation would be useful...

Okay, to change the HTML files, run the script like this: perl change.pl *.html

You will end up with a directory full of files that end with .html.new.
Check a couple of the .html.new files. I never really trust my own scripts
and always want to eyeball the results first. Once you're satisfied that the
*.html.new files look okay, rename them to .html.
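
If you would rather not rename the files by hand, here is a minimal sketch of that last step. The name promote_new_files is made up for illustration and is not part of change.pl. It assumes the directory name contains no spaces, since glob splits its pattern on whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of change.pl): rename every "*.html.new"
# file in $dir to "*.html". Returns the list of resulting file names.
sub promote_new_files {
    my ($dir) = @_;
    my @renamed;
    foreach my $new ( glob("$dir/*.html.new") ) {
        # Strip the trailing ".new" to get the target name.
        ( my $target = $new ) =~ s/\.new\z//;
        rename $new, $target or die "Cannot rename $new to $target: $!";
        push @renamed, $target;
    }
    return @renamed;
}
```

Calling promote_new_files('.') from the directory with the HTML files would do the renaming. On Unix-like systems rename replaces the existing .html files, so only run it after you have checked the results.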

How the script works...


> foreach my $file (@ARGV) {
>
Loop through all of the files passed on the command line.


> open my $in, "<$file";
>
Open the file for reading.


> my @slurp = <$in>;
> close $in;
>
Read the entire file into memory, split apart by lines. At this point, it's
just like sed.
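
The script does not check whether the open succeeded. A slightly more defensive sketch of the same read step, using the three-argument form of open, might look like this. The helper name read_lines is only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines of $file as a list.
sub read_lines {
    my ($file) = @_;
    # Three-argument open keeps odd characters in $file from being
    # interpreted as part of the open mode.
    open my $in, '<', $file or die "Cannot open $file: $!";
    my @lines = <$in>;    # list context: one element per line, newline kept
    close $in;
    return @lines;
}
```

Each element still carries its trailing newline, which matters for the join in the next step.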


> my $all = join( "", @slurp );
>
Combine all of the lines into one long string. We need this to check for the
pattern across lines. This is where Perl has the advantage over sed.
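
One detail worth seeing in isolation: the lines read from the file still end in their newlines, so the separator given to join decides whether the combined string matches the file. A small sketch with made-up sample lines:

```perl
use strict;
use warnings;

# Two lines as they would come back from <$in>, newlines included.
my @slurp = ( "softedges at\n", "softhome dot net\n" );

my $exact   = join( "",   @slurp );    # identical to the file contents
my $doubled = join( "\n", @slurp );    # extra blank line between the lines

print "exact:   ", length($exact),   " characters\n";
print "doubled: ", length($doubled), " characters\n";
```

An empty separator reproduces the file exactly, while "\n" puts an extra blank line between every pair of lines. The search pattern matches either way, because \s* soaks up any number of newlines.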


> $all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
>
Replace the text. The \s* matches any run of whitespace - including a newline.
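
To see the match across a line break, here is a small standalone sketch. The helper name fix_address and the sample text are made up. I have used /g here, which is my own variation: it replaces every occurrence, whereas without /g only the first one in the file is changed. The /m flag only affects ^ and $, so it is not needed for this pattern.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the substitution from the script.
sub fix_address {
    my ($text) = @_;
    # \s* matches zero or more whitespace characters, newlines included,
    # so the old address is found even when it wraps onto the next line.
    $text =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/g;
    return $text;
}

my $sample = "Write to softedges at\nsofthome dot net today.\n";
print fix_address($sample);
```

The address in the sample is split over two lines, and the substitution still catches it.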


> open my $out, ">$file.new";
> print $out $all;
> close $out;
>
Write the altered contents out to a different file. Overwriting the
originals makes me nervous. I like to eyeball the changes before committing.
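
As with the read step, the write can fail silently (a read-only directory, a full disk). A hedged sketch of the same step with error checks, where write_new is a name chosen just for this example:

```perl
use strict;
use warnings;

# Hypothetical helper: write $text to "$file.new" and return the new name.
sub write_new {
    my ( $file, $text ) = @_;
    my $new = "$file.new";
    open my $out, '>', $new or die "Cannot write $new: $!";
    print {$out} $text;
    # Checking close catches errors that only show up when the data is flushed.
    close $out or die "Cannot close $new: $!";
    return $new;
}
```

The original file is never touched, so comparing index.html with index.html.new remains possible before anything is renamed.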


> }
>

-- 
Robert Wohlfarth