.htaccess removing unwanted sub-folders from RHS

Using latest WordPress with .htaccess on Apache

Too many URLs are arriving with 'junk' sub-folders at the end, which cause 404s.
All of these unwanted sub-folders are made up of various combinations of digits (no idea what's generating them), usually 2, 3 or 4 digits per sub-folder.

https://office-watch.com/2012/privacy-law-and-cloud-storage/875/697
should be
https://office-watch.com/2012/privacy-law-and-cloud-storage/

https://office-watch.com/2014/9-hidden-extras-in-skype/15/086
should be
https://office-watch.com/2014/9-hidden-extras-in-skype/

Note: there's no trailing slash on the junk URLs.

In other words, we need a .htaccess Rewrite that:

  • removes any sub-folders after the first two sub-folders
  • but only IF those extra sub-folders (the trailing end of the URL) consist solely of digits.

But NOT when the URL ends in a file name. For example,
https://office-watch.com/2016/excel-stock-prices-from-google-finance/favicon.ico is OK (i.e. no folder slash at the end).
Also /amp/ is OK:
https://office-watch.com/2014/9-hidden-extras-in-skype/amp/ is OK.
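
The requirements above can be sketched as a small Python function, which may be handy as a reference when checking any candidate rewrite rule against sample URLs. This is only an illustration of the described behaviour, not part of the site: the name `strip_junk` is made up, and it assumes the junk always comes after the first two path segments and is always purely numeric, as stated above.

```python
import re
from urllib.parse import urlsplit, urlunsplit

DIGITS_ONLY = re.compile(r"[0-9]+")


def strip_junk(url: str) -> str:
    """Drop trailing all-digit path segments that follow the first two segments."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    keep, extra = segments[:2], segments[2:]
    # Rewrite only when there ARE extra segments and every one of them is numeric.
    if extra and all(DIGITS_ONLY.fullmatch(s) for s in extra):
        path = "/" + "/".join(keep) + "/"
        return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))
    # File names (favicon.ico) and non-numeric folders (/amp/) are left alone.
    return url


print(strip_junk("https://office-watch.com/2012/privacy-law-and-cloud-storage/875/697"))
print(strip_junk("https://office-watch.com/2014/9-hidden-extras-in-skype/amp/"))
```

The first call drops the two numeric sub-folders; the second URL comes back unchanged because `amp` is not numeric.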

The .htaccess has a long section added by a security plug-in, followed by this:

# BEGIN GD-SSL

Options +FollowSymLinks
RewriteEngine On
RewriteCond %{HTTPS} !=on
RewriteCond %{HTTP_USER_AGENT} .+$
RewriteCond %{SERVER_NAME} office-watch.com$
RewriteRule .* https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]
Header add Strict-Transport-Security "max-age=300"

# END GD-SSL

# BEGIN WordPress

RewriteEngine On
RewriteBase /
RewriteRule index.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]

# END WordPress

# BEGIN WordPress

RewriteEngine On
RewriteBase //
RewriteRule index.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . //index.php [L]

# END WordPress

awarded to Wuddrum


2 Solutions


Hey there. I made this .htaccess rewrite rule to do what you described. Honestly, I haven't tested it, since I'm using nginx right now, but I checked its syntax with an online tool and it seemed to be OK.

Let me know if it works.

RewriteRule ^([a-zA-Z0-9\-]*)/([a-zA-Z0-9\-]*)/[0-9\/]+$ $1/$2 [L]

It should do what you described: only rewrite the URL if there are more than two folders, and only if the extra folders contain nothing but digits.

Alas no. Doesn't seem to make any difference. Just to make sure we've done this right:
# BEGIN OW Hack to remove junk sub-folders from RHS
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^([a-zA-Z0-9\-]*)/([a-zA-Z0-9\-]*)/[0-9\/]+$ $1/$2 [L]
</IfModule>
# END
vanmorgan 4 months ago
Hmm, it's been a long time since I've used .htaccess rewrites. Try replacing [L] with [R,L]; I think R there means redirect.
alv-c 4 months ago
Tell me when you try this out
alv-c 4 months ago
Winning solution

You can try this:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^(.*?)/(.*?)/[\d\/]+$ $1/$2 [L]
</IfModule>

If you have any pre-existing rules, then put this before any of them. For example:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /

RewriteRule ^(.*?)/(.*?)/[\d\/]+$ $1/$2 [L]

RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>

Tested & working on my Apache setup.

EDIT:
Did you try it in this order?

# BEGIN GD-SSL
<IfModule mod_rewrite.c>
Options +FollowSymLinks
RewriteEngine On
RewriteCond %{HTTPS} !=on
RewriteCond %{HTTP_USER_AGENT} .+$
RewriteCond %{SERVER_NAME} 127.0.0.1$
RewriteRule .* https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]
Header add Strict-Transport-Security "max-age=300"
</IfModule>
# END GD-SSL

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteBase /
RewriteRule ^(.*?)/(.*?)/[\d\/]+$ $1/$2 [L]
</IfModule>

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule index.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase //
RewriteRule index.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . //index.php [L]
</IfModule>
# END WordPress

I tested this and it forces SSL + removes the junk numerics in one go.

Your code works ... at least in part. I can get it to work with an otherwise blank .htaccess (as it should), but not with the other necessary rewrites and tests, in particular the one that forces SSL. It was possible to get the Rewrite to work, but only with [R=301,L], which means all the other tests, including forcing SSL, are missed. And putting your code AFTER the SSL block doesn't work at all. I've added the main parts of the .htaccess to the question and increased the bounty, since this is obviously more complicated than a single RewriteRule line.
vanmorgan 4 months ago
See if my edit makes everything work OK; it works for me.
Wuddrum 4 months ago
Actually, it works for me in any order, be it first or last. You can also try it with [R=301,L] if the edit doesn't work on its own. Everything's still working fine for me, even with [R=301,L] and as a first rule. EDIT: Correction: it works before or after the SSL block, but only if placed BEFORE the WordPress blocks, so you should stick to the ordering I've presented in the edit.
EDIT2: Also, when using R=301 the rule itself should look like this: RewriteRule ^(.*?)/(.*?)/[\d\/]+$ /$1/$2 [R=301,L]
Wuddrum 4 months ago
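
One way to see what the R=301 rule above would match, without touching the server, is to run the same pattern through Python's `re` module. This is only an approximation: Apache uses PCRE rather than Python's regex engine, and the helper name `redirect_target` is made up for illustration. The sketch imitates the fact that, in .htaccess context, the pattern is matched against the path without its leading slash. The last two sample paths are assumptions about features this site may or may not use (multi-page posts and day archives); they are included because the pattern cannot tell them apart from junk.

```python
import re

# Pattern copied unchanged from the RewriteRule above.
PATTERN = re.compile(r"^(.*?)/(.*?)/[\d\/]+$")


def redirect_target(request_path: str):
    """Return the path the rule would redirect to, or None if it does not match."""
    # In .htaccess context the pattern sees the path without its leading slash.
    m = PATTERN.match(request_path.lstrip("/"))
    return f"/{m.group(1)}/{m.group(2)}" if m else None


samples = [
    "/2012/privacy-law-and-cloud-storage/875/697",               # junk digits
    "/2014/9-hidden-extras-in-skype/amp/",                       # must be left alone
    "/2016/excel-stock-prices-from-google-finance/favicon.ico",  # must be left alone
    "/2014/9-hidden-extras-in-skype/2/",                         # page 2 of a multi-page post?
    "/2014/05/12/",                                              # day archive?
]
for path in samples:
    print(path, "->", redirect_target(path))
```

The first path is redirected to /2012/privacy-law-and-cloud-storage, and the /amp/ and favicon.ico paths are not matched, as required. The last two are also redirected, so if the site relies on numeric page or date URLs, a tighter pattern may be needed.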