Wednesday, October 26, 2016

Removing Double Redirects or Redirect Chains from .htaccess files.

If you manage a large site using mod_rewrite or ISAPI_Rewrite 3, you will eventually run into issues maintaining the rewrite rules. After a few redesigns, or after moving things around repeatedly, you can end up with one page redirecting to another page, then another, and so on in a chain. This still works, but it is inefficient: every extra hop costs the visitor another round trip, it can hurt user experience and SEO value, and it makes the rules harder to maintain. This tool is a simple script you can pass your .htaccess file through to remove these multi-step redirects, so that each page redirects straight to the final destination page on the first redirect.

This is a Perl program. Save it (as simplify-rewrites, the name used in the command further down) along with a copy of your .htaccess file somewhere with Perl installed.
#!/usr/bin/perl
use strict;
use warnings;

# Get input file.
my $file = $ARGV[0] or die "Need input file\n";
open(my $data, '<', $file) or die "Could not open '$file' $!\n";
open(my $out, '>', "$file.new") or die "Could not open '$file.new' $!\n";
my $redir = 0;
my $double = 0;
my %adata;

# Build a list of all rewrites.
while (my $line = <$data>) {
    if ($line =~ /^RewriteRule \^/) {
        my @parts = split(/\s+/, $line);
        my $source = $parts[1];
        # Strip the regex anchors and add a leading slash so the source
        # can be compared against destination paths.
        $source =~ s/[\^\$]//g;
        $source = "/$source";
        my $destination = $parts[2];
        print "$source -> $destination\n";
        # Map each source path to its destination.
        $adata{$source} = $destination;
        $redir++;
    }
}
close $data;

# Process the file again, this time applying updates.
open($data, '<', $file) or die "Could not open '$file' $!\n";
while (my $line = <$data>) {
    if ($line =~ /^RewriteRule \^/) {
        # Check each rewrite rule to see if its destination has another rule.
        my @parts = split(/\s+/, $line);
        my $source = $parts[1];
        $source =~ s/[\^\$]//g;
        $source = "/$source";
        my $destination = $parts[2];
        if ($adata{$destination}) {
            #print "$source -> $destination -> $adata{$destination}\n";
            # Print a replacement rule from the source to the real destination.
            # The flags column (e.g. [R=301,L]) is optional, so only emit it if present.
            my $flags = defined $parts[3] ? " $parts[3]" : '';
            print $out "$parts[0] $parts[1] $adata{$destination}$flags\n";
            $double++;
        } else {
            # If not, just print the line as is.
            print $out $line;
        }
    } else {
        # Output all non-rewrite lines as is.
        print $out $line;
    }
}
close $data;
close $out;

# Print some summary text.
print "$redir Redirects Found, $double Double Redirects.\n";

Once saved you can run it by opening a terminal and running the following command:
perl simplify-rewrites filename

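The program prints a summary and writes a new file, filename.new, with the double redirects collapsed. Each rule's destination is looked up only once per run, so for chains longer than two redirects, run the script again on the new file until it reports 0 double redirects. You can use the following to get a list of all the changes for review (here the input file was rw.txt).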
diff -Na --unified=0 rw.txt rw.txt.new | less
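
As an alternative to rerunning the script for long chains, the lookup could follow the map until it reaches a destination that has no rule of its own. The following is only a sketch of that idea, not part of the original script: resolve_final and %example are hypothetical names, and the map is assumed to have the same shape as %adata (source path => destination).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: follow a redirect map until the destination has
# no further rule. $map is a hash reference shaped like %adata
# (source path => destination path).
sub resolve_final {
    my ($map, $start) = @_;
    my %seen = ($start => 1);
    my $current = $start;
    while (exists $map->{$current}) {
        my $next = $map->{$current};
        # Redirect loop detected: give up and leave the original target alone.
        return $start if $seen{$next}++;
        $current = $next;
    }
    return $current;
}

# Made-up sample data for illustration only.
my %example = (
    '/old'    => '/newer',
    '/newer'  => '/newest',
    '/newest' => '/final',
);
print resolve_final(\%example, '/old'), "\n";   # prints "/final"
```

In the script, a call such as resolve_final(\%adata, $destination) could stand in for the single $adata{$destination} lookup. The %seen guard matters because a redirect loop in the rules would otherwise keep the while loop running forever.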
