spicnspan.pl – “clean up” text files

At times I get tired of editing code with tabs. And by “at times”, I mean every time. I hate tabs in code. I hate trailing spaces in code too (they make it more difficult to navigate using vi/vim). And more than anything, I hate Windows CR/LF (carriage return/line feed) line terminator characters.

So I wrote a script to take care of it. I call it “spicnspan”. To use: spicnspan

Simple enough.

This script:

  • Converts all tab chars in a text file to 4 spaces.
  • Converts Windows-style CR/LF line terminators to UNIX newline chars (0x0a).
  • Removes trailing spaces.
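
As a rough illustration of the three steps above, here is a minimal sketch applied to an in-memory string. The helper name `clean_text` and the sample input are hypothetical, chosen only for this example. It assumes that only CR/LF pairs need converting and that "trailing spaces" means spaces directly before a newline.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: applies the three clean-up steps
# to a string and returns the result.
sub clean_text {
    my ($text) = @_;
    $text =~ s/\r\n/\n/g;    # Windows CR/LF -> UNIX newline (0x0a)
    $text =~ s/\t/    /g;    # each tab -> 4 spaces
    $text =~ s/ +\n/\n/g;    # drop spaces that sit right before a newline
    return $text;
}

# Sample input with tabs, trailing spaces and CR/LF line endings.
my $dirty = "if (x) {\t \r\n\treturn 1;  \r\n}\r\n";
print clean_text($dirty);
```

The order matters in this sketch: tabs are expanded before trailing spaces are removed, so a tab at the end of a line is cleaned up as well.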

If I were to re-code this, I would allow for multiple filenames on the command-line. I would also make separate smaller functions, one each for:

1. Slurping the file data
2. “Fixing” (i.e. removing spaces, etc.) the file data
3. Writing the file back to disk

This is done now. The code below has been updated to reflect these changes, as has the direct link.

Here’s a direct link to the code (save as .pl on your computer, or whatever you want if you have a real OS):

spicnspan


#! /usr/bin/perl
#
# name: spicnspan
# description: Removes tab chars (converts to 4 spaces) & trailing spaces
# from code.
# developer: Nathan G. Marley
# date: 2010Jan19
#
# change history:
# ========================================================================
# description: change description placeholder
# developer: developer name goes here
# date: YYYYMmmDD
# ========================================================================

use strict;
use Data::Dumper;
use File::Basename;
use Carp;

# boilerplate
my $progname = basename($0);
my $usage = "usage: $progname ...";

# main section
die "$usage\n" unless @ARGV;

foreach my $filename ( @ARGV ) {

    # slurp file data
    my $indata = slurp_file( $filename );

    # clean file data
    my $outdata = scrub_data( $indata );

    # write file data to disk
    write_file( $filename, $outdata );

}

# subroutines...

sub slurp_file {
    my $codefile = shift;
    my $data;

    if ( ! -f $codefile ) {
        die "error: '$codefile' doesn't exist or is not a regular file.\n";
    }

    # read the whole file at once by locally undefining the record separator
    open(my $in, '<', $codefile) or die "Can't open '$codefile': $!";
    {
        local $/;
        $data = <$in>;
    }
    close($in);

    return $data;
}

sub scrub_data {
    my $indata = shift;
    my $outdata = $indata;

    # strip carriage returns (the CR in Windows-style CR/LF line endings)
    $outdata =~ s%\x0d%%g;

    # convert all tabs to 4 space chars
    $outdata =~ s%\t%    %g;

    # remove any trailing spaces
    $outdata =~ s% +\n%\n%g;

    return $outdata;
}

sub write_file {
    my ( $codefile, $outdata ) = @_;
    open(my $out, '>', $codefile) or die "Can't open '$codefile' for writing: $!";
    print {$out} $outdata;
    close($out) or die "Can't close '$codefile': $!";
}