Comprehensive Guide to .htaccess
.htaccess configures the way that a webserver deals with a variety of requests. Quite a few webservers support it, like the well-known Apache software that the majority of commercial web-hosting providers tend to favour.
.htaccess files work at the directory level, which lets them override the server’s global configuration settings, as well as any .htaccess directives set further up the directory tree.
Why is it called .htaccess – basics explained
This type of file was initially used to limit user access to specific directories, and the name has just stuck.
It uses a subset of Apache’s httpd.conf settings directives that give a sys admin control over who has access to each directory. It looks to an associated .htpasswd file for the usernames and passwords of those people who have permission to access them.
.htaccess still performs this valuable function, but it’s a file that’s grown in versatility to do more besides.
Where will I find the .htaccess file?
Any decent .htaccess tutorial will tell you that theoretically you could find one in every folder (directory) on your server, but typically, the web root folder (the one that contains everything of your website) will have one. It will usually have a name like public_html or www.
If you’ve got a directory with numerous website subdirectories, you will typically find an .htaccess file in the main root ( public_html ) directory, and one in all of the subdirectories (/sitename) too.
Why can’t I find the .htaccess file?
With the majority of file systems, file names that start with a dot ( . ) will be hidden, so by default, you won’t be able to see them.
You can still get to them though. If you look at your FTP client or File Manager, you will likely find a setting to “show hidden files.” It may be in some other location depending on which program you use, but you’ll usually find it if you look under “Preferences”, “Settings”, “Folder Options” or “View.”
What if I don’t have an .htaccess file?
The first thing to establish is that you definitely don’t have one. Check that you have set the system to “show hidden files” (or whatever it’s called on your system) so that you can be sure it really isn’t there. You should have an .htaccess file as they’re frequently created by default, but it’s always worth checking.
If you’ve looked everywhere and you still can’t find one, never fear because .htaccess basics are not hard to understand. You can make one by opening a text editor and creating a new document.
It should not have the .txt or any other file extension, just .htaccess, and make sure that it’s saved in ASCII format (it should not be in UTF-8 or anything else) as .htaccess.
Transfer it to the right directory using FTP or the file manager in your web browser.
What’s an error code?
One of your simple .htaccess basics is setting up error documents.
Any decent .htaccess tutorial will tell you that every time a web server receives a request it will attempt to respond to it, most often by offering up a document (as it does with HTML pages), or by pulling that response from a particular application (as happens with Content Management Systems and other web apps).
If this process trips up, then the server reports an error and its corresponding code. Different types of errors have different error codes, and you’ve probably seen a 404 “Not Found” error quite a few times. It’s not the only one though:
Client Request Errors
- 400 — Bad Request
- 401 — Authorization Required
- 402 — Payment Required (not used yet)
- 403 — Forbidden
- 404 — Not Found
- 405 — Method Not Allowed
- 406 — Not Acceptable (encoding)
- 407 — Proxy Authentication Required
- 408 — Request Timed Out
- 409 — Conflicting Request
- 410 — Gone
- 411 — Content Length Required
- 412 — Precondition Failed
- 413 — Request Entity Too Long
- 414 — Request URI Too Long
- 415 — Unsupported Media Type.
Server Errors
- 500 — Internal Server Error
- 501 — Not Implemented
- 502 — Bad Gateway
- 503 — Service Unavailable
- 504 — Gateway Timeout
- 505 — HTTP Version Not Supported
What Happens by Default?
When an approach to handling errors isn’t specified, the server just sends the message to the browser, which gives the user a general error message, but this isn’t especially helpful.
Creating Error Documents
At this point in your .htaccess guide, you’ll need an HTML document for each error code. You can call them anything you like, but you might want to consider giving them a name that’s appropriate, such as not-found.html or just 404.html.
Then, in the .htaccess file, determine which document goes with which error type.
ErrorDocument 400 /errors/bad-request.html
ErrorDocument 401 /errors/auth-reqd.html
ErrorDocument 403 /errors/forbid.html
ErrorDocument 404 /errors/not-found.html
ErrorDocument 500 /errors/server-err.html
Note that each one gets its own line.
And you’re done. It’s no more complicated than that.
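ErrorDocument will also accept a short message in quotes in place of a file path. The lines below are a minimal sketch of that option, assuming you’d rather not create a separate HTML file for every code; the wording of the message is only an example.

```apache
# Serve a short inline message instead of a file (the text is an example)
ErrorDocument 403 "Sorry, you are not allowed to view this page."
# Use a page on the same site for anything that can't be found
ErrorDocument 404 /errors/not-found.html
```

A local path like the one above keeps the original status code. If you point ErrorDocument at a full http:// URL instead, the server sends a redirect, so the browser never sees the original error code.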
Alternatives to .htaccess for error handling
The majority of Content Management Systems (CMS) like WordPress and Drupal, and the majority of web apps too, will deal with these error codes in their own way.
Password Protection With .htaccess
As we’ve said, .htaccess files were originally used to limit which users could get into certain directories, so let’s take a look at that in our .htaccess tutorial first.
.htpasswd – this file holds usernames and passwords for the .htaccess system
Each one sits on its own line, with the username and the hashed password separated by a colon, like this:
username:hashedpassword
Note that this password isn’t the actual one, it’s just a cryptographic hash of the password, which means that it’s been put through a hashing algorithm, and this is what came out. The same thing happens at login: each time a user logs in, the password they type is put through that same algorithm, and if the result matches the stored hash, they’re given access.
This is a much safer way of storing passwords than plain text, because even if someone gets into your .htpasswd file, all they’re seeing is a list of hashed passwords, not the real ones. With a strong algorithm there’s no practical way to use them to reconstruct the passwords either, because the algorithm is a one-way street.
You can choose from a few different hashing algorithms:
- bcrypt — The securest one but chugging through the encryption process slows it down as a result. Apache and Nginx compatible.
- md5 — The latest versions of Apache use this as their default hashing algorithm, but Nginx doesn’t support it.
Insecure Algorithms — These are best avoided.
- crypt() — was previously the default hashing function, but isn’t a secure option.
- SHA and Salted SHA.
.htaccess tutorial – Adding usernames and passwords with the command line
You can use the command line or an SSH terminal to create an .htpasswd file and add username-password pairs to it directly.
htpasswd is the command for dealing with the .htpasswd file.
Use the command with the -c option to create a new .htpasswd file, then enter the directory path (the actual path on the server, not the URL). You can also add a user if you want to.
> htpasswd -c /usr/local/blah/.htpasswd jamesbrown
This makes a new .htpasswd file in the /blah/ directory, along with a record for a user called jamesbrown. You’ll be asked for a password, which will be hashed (using md5 by default) and stored.
Be careful with -c: if an .htpasswd file already exists at that location, it is overwritten. To add a user to an existing file, run the same command without the -c option.
If you’d rather use the bcrypt hashing algorithm, go with the -B option (a capital B; the lowercase -b does something different and takes the password from the command line).
Password hashing without the command line
If you’re only familiar with .htaccess basics and you’d rather not use the command line or SSH terminal for whatever reason, you can just create an .htpasswd file and use a text editor to fill everything in before uploading it using FTP or file manager.
Of course, that leaves you with the task of hashing the passwords, but that shouldn’t be a problem because there are lots of password hashing tools to be found online. One option is the htpasswd generator at Aspirine.org.
It offers a few choices of algorithm that let you determine how strongly the password is hashed. Once you run it, copy your hashed password into the .htpasswd file.
You’ll only need one .htpasswd file for all your .htaccess files, so there’s no need to have one for each. One will do the job for the whole main server directory or web-hosting account.
Don’t put your .htpasswd file in a directory that can be accessed by all, so not in public_html or www or any subdirectory. It’s safer from a security standpoint to put it somewhere that can only be accessed from within the server itself.
How to use .htpasswd with .htaccess
Each directory can have its own .htaccess file, and in each one you can specify which set of users has access to that directory.
If you want to grant universal access then do nothing, as it’s enabled by default.
If you want to limit who can get access, then your .htaccess file should look like this:
AuthUserFile /usr/local/blah/.htpasswd
AuthName "Name of Secure Area"
AuthType Basic
<Limit GET POST>
require valid-user
</Limit>
Line one shows the location of where your usernames and passwords are held. Line two defines the name for the area you want to keep secure, and you can call it what you want. Line three specifies “Basic” authentication, which is fine in most instances.
The <Limit> tag defines what is being limited (in this instance, the ability to GET or POST to any file in the directory). Within the pair of <Limit> tags is a list of who is allowed to access files.
In this example, files can be accessed by any valid user. If you only want certain users to have access you can name them.
AuthUserFile /usr/local/blah/.htpasswd
AuthName "Name of Secure Area"
AuthType Basic
<Limit GET POST>
require user janebrown
require user jamesbrown
</Limit>
You can also grant or deny access based on the group that you put users in, which is a real time saver. You can do this by creating a group file and adding names.
Give your group file a name, such as .htpeople, and have it look something like this:
admin: janebrown jamesbrown
staff: zappafrank agrenmorgen
Now it’s become something that you can refer to in your .htaccess file:
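AuthUserFile /usr/local/blah/.htpasswd
AuthGroupFile /usr/local/blah/.htpeople
AuthName "Admin Area"
AuthType Basic
<Limit GET POST>
require group admin
</Limit>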
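One thing worth knowing about the <Limit GET POST> wrapper is that it only protects the request methods it names. The following is a hedged sketch of the valid-user setup without the wrapper, so that every method is covered. The file path is the example one used with the htpasswd command earlier, and the area name is a placeholder.

```apache
# Basic authentication applied to every HTTP method, not just GET and POST
AuthUserFile /usr/local/blah/.htpasswd
AuthName "Name of Secure Area"
AuthType Basic
# No <Limit> block, so the requirement is not restricted to named methods
Require valid-user
```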
.htaccess guide – alternatives to .htpasswd
It only really makes sense to use .htaccess and .htpasswd to limit file access on your server if you’ve got a lot of static files. This approach appeared in the early days of the web when sites were usually made up of a lot of HTML documents and other resources.
If you’re using a content management system (CMS) like WordPress, you’ll have a feature that lets you do this as part of the system.
Enabling Server Side Includes (SSI)
SSI is a simple scripting language which is mainly used for embedding HTML documents into other HTML documents, so frequently-used elements like menus and headers can easily be reused.
<!--#include virtual="header.shtml" -->
It’s also got conditional directives (if, else, etc.) and variables, which makes it a complete scripting language, although one that’s hard to use if you have anything more complicated in your project than one or two includes. If it gets to that point then a developer will usually be reaching for PHP or Perl instead.
.htaccess tutorial – enabling SSI
Server Side Includes are enabled by default with some web-hosting servers. If yours isn’t, you can use your .htaccess file to enable it, like this:
AddType text/html .shtml
AddHandler server-parsed .shtml
Options Indexes FollowSymLinks Includes
This should enable SSI for all files that have the .shtml extension.
You can tell SSI to parse .html files using a directive like this:
AddHandler server-parsed .html
Why bother? Well, this way lets you use SSI without alerting anyone to the fact that you are doing so. On top of that, if you change implementations later, you can hold on to your .html file extensions.
The only fly in the ointment here is that every .html file will then be parsed with SSI, and if you’ve got lots of .html files that aren’t in need of SSI parsing, it makes the server work needlessly harder, bogging it down for no extra benefit.
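One possible middle ground is the XBitHack directive, sketched here on the assumption that your host runs Apache with mod_include and lets you change file permissions. It tells the server to parse only those .html files that have the execute bit set, rather than all of them.

```apache
# Allow includes in this directory
Options +Includes
# Parse an HTML file for SSI only if its user-execute bit is set
XBitHack on
```

You would then mark just the pages that contain includes as executable (for example with chmod +x on the file) and leave the rest alone.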
SSI on your Index page
If you want to use SSI on your index (home) page without parsing every single .html file, you will have to stipulate that in your .htaccess file, because when the web server looks for the index page of the directory it will be hunting for index.html by default.
If you aren’t parsing .html files, you will have to name your index page index.shtml if you want SSI to work, and your server won’t automatically look for it.
To make that happen just add:
DirectoryIndex index.shtml index.html
This lets the web server know that the index.shtml file is the main one for the directory. The second parameter, index.html is a failsafe, which is referred to when it can’t find index.shtml.
IP Blacklisting and IP Whitelisting with .htaccess
If you’ve had problems from certain users from specific IP addresses then there are .htaccess basics that can be used to help you blacklist (block) them, or you can do the opposite and whitelist (approve) everyone from particular addresses if you want to exclude everybody else.
Blacklisting by IP
This will let you blacklist addresses (the numbers are examples):
order allow,deny
deny from 192.0.2.15
deny from 198.51.100.
allow from all
The first line says to evaluate the allow directives before the deny directives, which makes allow from all the default state. In this case only those which match the deny directives will be denied.
If you switched it round to say deny,allow, then the last thing it looked at would be the allow from all directive, which would allow everybody, and override the deny statements.
Take note of the third line, which says deny from 198.51.100. This isn’t a complete IP address, but that’s okay because it denies every IP address in that block (anything that begins with 198.51.100.).
You can include as many IP addresses as you like, one on each line, with a deny from directive.
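The order, allow and deny directives are the older Apache 2.2 syntax, which Apache 2.4 only supports through the mod_access_compat module. If your host runs 2.4, a rough equivalent using the newer Require directive might look like the sketch below. The addresses are documentation examples rather than real offenders.

```apache
# Apache 2.4 style: let everyone in except the listed addresses
<RequireAll>
    Require all granted
    # Block a single address
    Require not ip 192.0.2.15
    # Block every address that begins with 198.51.100
    Require not ip 198.51.100
</RequireAll>
```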
Whitelisting by IP
The opposite of blacklisting is whitelisting — restricting everyone except those you specify.
As you might suspect, the order directive has to be turned back to front, so that you deny access to everyone at first, but then allow certain addresses after.
order deny,allow
deny from all
allow from 203.0.113.25
allow from 192.0.2.
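On Apache 2.4 the same kind of whitelist can be written with Require ip. This is a minimal sketch that assumes the newer authorization module is in use, with example addresses.

```apache
# Apache 2.4 style: only the listed addresses get in
Require ip 203.0.113.25
# A partial address admits the whole block that begins with 192.0.2
Require ip 192.0.2
```

When several Require lines appear together like this, a visitor only has to match one of them to be allowed.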
Domain names instead of IP addresses
Users can also be blocked or allowed using a domain name. This is helpful if people are moving between IP addresses, but it won’t work against anyone who has control of their reverse-DNS IP address mapping.
order allow,deny
deny from forinstance.com
allow from all
This works for subdomains, too. In the example above, visitors from abc.forinstance.com will be blocked as well.
Block Users by Referrer
If a website contains a link to your site and someone follows it, the website they came from is known as the referrer.
But this doesn’t only work for clickable hyperlinks to your website. Any page on the internet can link to your images directly. This is called hotlinking and it steals your bandwidth, it may infringe on your copyright, and you don’t get any extra traffic out of it. And it’s not just images either. A stranger can link to your other resources like CSS files and JS scripts, too.
This is bound to happen a little bit and most site owners tolerate it but it’s the kind of thing that can easily escalate into something more abusive. And there are times when in-text clickable hyperlinks can cause you problems too, like when they’re from troublesome or nefarious websites. These are just a few of the reasons why you might decide to deny requests that originate with particular referrers.
If you need to do this, you’ll have to activate the mod_rewrite module. The majority of web hosts enable it automatically, but if yours doesn’t (or you can’t tell if they have) it’s worth getting in touch to ask. If they’re reluctant to enable it, it’s maybe worth thinking about getting a new host.
.htaccess basics—the directives that perform blocking on a referrer basis depend on the mod_rewrite engine.
The code to block by referrer looks like this:
RewriteEngine on
RewriteCond %{HTTP_REFERER} ^http://.*forinstance\.com [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*forinstance2\.com [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://.*forinstance3\.com [NC]
RewriteRule .* - [F]
It’s slightly fiddly, so let’s go through it.
RewriteEngine on, on the first line tells the parser that some rewrite directives are on the way.
Each of lines 2, 3 and 4 blocks a single referring domain. To change this for your own purposes you would alter the domain name part (forinstance) and extension (.com).
The backslash in front of the .com is an escape character. The pattern matching used in the domain name is a regular expression, and the dot has a special meaning in RegEx, so it must be “escaped” with a backslash.
The NC in the brackets specifies that the match shouldn’t be case sensitive. The OR literally means “or”, and indicates that more conditions are on the way (“As long as the referrer is this one, this one or this one, go along with this rewrite rule.”)
The final line is the rewrite rule itself. The [F] stands for “Forbidden.” If a request comes from a referrer that matches one on the list, it will be blocked and a 403 Forbidden error will be delivered.
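The patterns above only match referrers that start with http://. Here is a hedged variation for one domain that also covers https and any subdomain; forinstance.com is still just a placeholder.

```apache
RewriteEngine on
# The optional "s" covers https; the optional group covers any subdomain
RewriteCond %{HTTP_REFERER} ^https?://([^/]+\.)?forinstance\.com(/|$) [NC]
# Answer matching requests with 403 Forbidden
RewriteRule .* - [F]
```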
Blocking Bots and Web Scrapers
Sometimes it isn’t even people trying to eat up your bandwidth, it’s robots. These programs come along and lift information from your site, typically so that it can be republished by some low-quality SEO outfit.
There are genuine bots out there, such as the ones that come from the big search engines, but the others are almost like cockroaches, scavenging and doing you no good whatsoever.
To date, many hundreds of bots have been identified. You won’t ever be able to block them all, but it doesn’t hurt to try. A long enough list of rewrite rules that match on the User-Agent request header can trip up 350+ known bots.
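The general shape of such rules is sketched below. The two bot names are made up for illustration, so you would substitute user-agent strings taken from a maintained blocklist or from your own logs.

```apache
RewriteEngine on
# "BadBot" and "ExampleScraper" are hypothetical names, not real bots
RewriteCond %{HTTP_USER_AGENT} ^BadBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ExampleScraper [NC]
# Refuse any request whose user agent matched one of the conditions
RewriteRule .* - [F,L]
```

Bear in mind that a bot can claim to be anything it likes in its User-Agent header, so this only stops the ones that identify themselves honestly.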
Specify a Default File for a Directory
When a web server receives a request for a URL with no file name specified, it will usually assume that the URL refers to a directory.
So, if you request http://forinstance.com, Apache (and most other web servers) will look in the root directory for the domain (typically /public_html or something like it, such as /forinstance-com) to find the default file.
The default file will be called index.html (by default!) because when the Internet was young, websites were often just a bunch of documents bundled together, and “home” pages were often no more than an index, so that you knew where everything was.
Of course, nowadays you might not want index.html to be the default page, perhaps because you might want a different file type, so index.shtml, index.xml, or index.php might be more appropriate, or maybe you don’t think of your home page as an “index,” and want to call it something else, like home.html or primary.html.
Set the Default Directory Page
.htaccess basics—this lets you set the default page for a directory with ease:
DirectoryIndex [filename goes here]
If you want your default to be home.html it’s as simple as using:
DirectoryIndex home.html
Setting More Than One Default Page
You can also set more than one DirectoryIndex:
DirectoryIndex index.php index.shtml index.html
The way this works is that the web server looks for the first one first. If it can’t find that, it looks for the second one, and on it goes.
But why would you need to do this? Wouldn’t you know which file you wanted to use as your default page?
Keep in mind that one of the .htaccess basics is that it influences its own directory, and each subdirectory too until it’s overruled by a more local file. So, an .htaccess file in your root directory can give instructions for lots of subdirectories, and they could all have their own default page names.
When you put all those rules into just one .htaccess file in the root it spares you the tedious work of duplicating all the directives it contains at the level of every directory.
URL Rewriting and URL Redirects
.htaccess basics—one of the most common uses of .htaccess files is URL redirects.
If the URL for a document or resource has changed—say because you’ve moved things around on your website or you’ve changed domain names—then URL redirects can help you.
301 or 302
There are two types of redirect status codes that the web server will generate, namely 301 and 302.
301 tells you that something has been “Permanently Moved,” and 302 means it’s been “Moved Temporarily.” In the majority of cases, 301 does a perfectly good job, and perhaps most importantly it passes any SEO brownie points that the original URL may have picked up on to the new page.
It will also make most browsers update their bookmarks and cache the old-to-new mapping, which lets them request the new URL when the original is being looked for. For a permanently changed URL these are all the responses you want.
There’s not much to be gained from using 302 redirects, because there is rarely a reason to change a URL on a temporary basis. Changing one at all is not something that anybody should really want to do but it sometimes has to be done, and there are usually better options available than changing it just for a while, with the intention of changing it back later.
Redirect or Rewrite
You can change a URL with .htaccess directives in a couple of ways — the Redirect command and the mod_rewrite engine.
The Redirect command tells the browser which other URL it should be looking out for.
The mod_rewrite tool will normally “translate” the URL that’s in the request into something the file system or CMS can understand, then it treats the request as though the translated URL was the one that was requested.
From the perspective of the web browser it’s business as usual. It gets the content it requested and carries on as if nothing happened.
The mod_rewrite tool is also able to produce 301 redirects that work like the Redirect command, but with a greater number of possible rules, including elaborate pattern matching and rewriting instructions, which is beyond what Redirect can do.
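To make the difference concrete, here is a small sketch of both behaviours, assuming the .htaccess file sits in the web root. The file names product.php, old-page.html and new-page.html are hypothetical.

```apache
RewriteEngine on
# Internal rewrite: the browser keeps showing /products/42
# while the server quietly runs /product.php?id=42
RewriteRule ^products/([0-9]+)$ /product.php?id=$1 [L]
# External redirect: the browser is told to go and fetch the new URL
RewriteRule ^old-page\.html$ /new-page.html [R=301,L]
```

The R flag is what turns a rule into a redirect. Without it, the rewrite stays on the server and the visitor never sees the translated URL.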
Basic Page Redirect
For redirecting one page to another URL, the code looks like this:
Redirect 301 /relative-url.html http://forinstance.com/full-url.html
A single space separates each of the four parts of this one-line command, so you have:
- the Redirect command itself
- its type ( 301 – Moved Permanently )
- the original page’s relative URL
- the full URL of the new page.
The relative URL is a path that starts with a slash and is read from the root of the domain, which is normally where the main .htaccess file sits.
So, if http://forinstance.com/blog.php had been moved to http://blog.forinstance.com, the code would be:
Redirect 301 /blog.php http://blog.forinstance.com
Redirecting a large section
If you’ve made changes to your directory structure, but not your page names, you may want to redirect all requests for a particular directory to the new one.
Redirect 301 /old-directory http://forinstance.com/new-directory
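Redirect carries the rest of the path across, so a request for /old-directory/page.html ends up at /new-directory/page.html. If the file names changed in a predictable way as well, RedirectMatch accepts a regular expression. This sketch assumes, purely for illustration, that the pages also moved from a .htm to a .html extension.

```apache
# Capture the page name and reuse it as $1 with the new extension
RedirectMatch 301 ^/old-directory/(.*)\.htm$ http://forinstance.com/new-directory/$1.html
```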
Redirecting an entire site
But how about if your entire site has moved to a new URL? No problem.
Redirect 301 / http://thenewurl.com
Redirecting www to non-www
More and more websites are turning their back on the www subdomain. There’s never really been a need for it. It’s a throwback to a time when lots of people who ran a website used a server to look after many of their own documents, and the www or “world wide web” directory was where they put anything they wanted to offer up to others.
Some people still use it to this day, but a lot have moved on. Some users, though, can’t stop themselves from typing www. in front of every URL out of habit, which is tricky for you if yours has been shorn of those letters.
But never fear, because the mod_rewrite module can help you with this, and your web host has probably enabled it already.
RewriteEngine on
RewriteCond %{HTTP_HOST} ^www\.forinstance\.com [NC]
RewriteRule ^(.*)$ http://forinstance.com/$1 [R=301,L]
Many other .htaccess and mod_rewrite guides will give you some version of this code to achieve this:
RewriteCond %{HTTP_HOST} !^forinstance\.com [NC]
RewriteRule ^(.*)$ http://forinstance.com/$1 [R=301,NC]
Can you see what’s wrong with it?
All subdomains are redirected to the primary domain, which means not only www.forinstance.com, but others like blog.forinstance.com and admin.forinstance.com too. Not ideal behaviour!
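If you’d rather not hard-code the domain at all, the host name can be captured in the condition and reused in the rule. This is a minimal sketch of that idea; it only touches hosts that begin with www. and leaves other subdomains alone.

```apache
RewriteEngine on
# Match only hosts that begin with www. and capture the rest as %1
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
# Send the visitor to the same path on the bare domain
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
```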
Redirecting to www
So, what happens if you’re using the www subdomain?
You should probably set up a redirect to make sure people get to where they’re trying to go. Especially now that fewer people are likely to automatically add that www to the beginning of URLs.
All you need to do is reverse the code to achieve this.
RewriteEngine on
RewriteCond %{HTTP_HOST} ^forinstance\.com [NC]
RewriteRule ^(.*)$ http://www.forinstance.com/$1 [R=301,L]
One thing not to do:
A number of .htaccess guides recommend redirecting 404 errors to your home page. While this is possible, it doesn’t mean it’s something that you should do. In fact, we’d go so far as to say it’s an awful idea, because it leaves visitors confused. They will have been expecting another page, and instead they get your homepage. A 404-error page would have told them exactly what they needed to know, whereas this does not. And anyway, what’s the problem with admitting that a page can’t be found? There’s no shame in it.
Why use .htaccess rather than other approaches?
Redirects can be set up with server-side scripting, like in PHP files. They can also be set up from within your Content Management System (which is pretty much the same thing).
But using .htaccess is usually the fastest type of redirect. With PHP-based redirects, or other server-side scripting languages, the entire request must be completed, and the script is actually interpreted before a redirect message is sent to the browser.
As any .htaccess guide will tell you, .htaccess redirects are much faster because the server responds to each request directly.
Be aware that some CMSs handle redirects by updating the .htaccess file in a programmatic way. WordPress is one example of a system that does this.
This gives you the speed benefits of directly using .htaccess combined with the convenience of managing it from inside your application.
.htaccess Basics – Hiding Your .htaccess File
A concept that should be one of your .htaccess basics is that the file shouldn’t be visible from the web. There’s just no reason for it, apart from perhaps wanting to locate your .htpasswd file. And as another rule of .htaccess basics, random strangers shouldn’t be able to look at details of your implementation, including rewrite rules, directory settings, and security. Hiding all of that stuff makes it more difficult for hackers to work out ways into your system.
Luckily, you can hide your .htaccess file fairly easily using this code:
<Files .htaccess>
order allow,deny
deny from all
</Files>
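On Apache 2.4 the same protection can be expressed with Require. The sketch below assumes the newer authorization syntax is available and extends the rule to any file whose name starts with .ht, which takes in .htpasswd too.

```apache
# Apache 2.4 style: refuse to serve any file whose name starts with .ht
<FilesMatch "^\.ht">
    Require all denied
</FilesMatch>
```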
MIME Types
MIME types are types of file. They got their name because they were originally associated with email (it stands for “Multipurpose Internet Mail Extensions”). Don’t just think of them as “file types” because MIME suggests a specific format for specifying the file type.
If you have ever written an HTML document, you’re likely to have specified a MIME type, probably without realising it:
<link rel="stylesheet" type="text/css" href="/style.css" />
The type attribute refers to a particular MIME type.
MIME types on your server
Occasionally you might find that your web server isn’t set up to deliver a specific file type, and any requests for that type of file just don’t work.
Usually, you can get around this by putting the MIME type in your .htaccess file.
AddType text/richtext rtx
This directive has three space-separated parts:
- The AddType command
- The MIME type
- The file extension.
You can associate a number of different file extensions with the same MIME type on one line.
AddType video/mpeg mpg MPEG MPG
Force Download by MIME Type
If you want every link to a certain type of file to automatically download rather than just open in your browser, use the MIME type application/octet-stream, like this:
AddType application/octet-stream pdf
As before, you can include numerous file extensions:
AddType application/octet-stream rtf txt pdf docx doc
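Changing the MIME type is a fairly blunt instrument, because the browser no longer knows what kind of file it is receiving. An alternative, sketched here on the assumption that your host has the mod_headers module enabled, is to leave the type alone and ask for a download with a Content-Disposition header.

```apache
<IfModule mod_headers.c>
    # Ask the browser to save these files rather than display them
    <FilesMatch "\.(pdf|rtf|docx?)$">
        Header set Content-Disposition "attachment"
    </FilesMatch>
</IfModule>
```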
List of File Extensions and MIME Types
Here’s an incomplete list of file formats and associated MIME types.
If you manage your own website, and you already know the file types of your resources, then you won’t need to paste this whole list into your .htaccess file.
But if you run a site along with lots of other people who could be uploading all sorts of stuff, then this might help to avoid any potential publishing missteps. This is particularly pertinent for file sharing or project management sites where folks are bound to be sharing lots of different files.
AddType application/mac-binhex40 hqx
AddType application/netalive net
AddType application/netalivelink nel
AddType application/octet-stream bin exe
AddType application/oda oda
AddType application/pdf pdf
AddType application/postscript ai eps ps
AddType application/rtf rtf
AddType application/x-bcpio bcpio
AddType application/x-cpio cpio
AddType application/x-csh csh
AddType application/x-director dcr
AddType application/x-director dir
AddType application/x-director dxr
AddType application/x-dvi dvi
AddType application/x-gtar gtar
AddType application/x-hdf hdf
AddType application/x-httpd-cgi cgi
AddType application/x-latex latex
AddType application/x-mif mif
AddType application/x-netcdf nc cdf
AddType application/x-onlive sds
AddType application/x-sh sh
AddType application/x-shar shar
AddType application/x-sv4cpio sv4cpio
AddType application/x-sv4crc sv4crc
AddType application/x-tar tar
AddType application/x-tcl tcl
AddType application/x-tex tex
AddType application/x-texinfo texinfo texi
AddType application/x-troff t tr roff
AddType application/x-troff-man man
AddType application/x-troff-me me
AddType application/x-troff-ms ms
AddType application/x-ustar ustar
AddType application/x-wais-source src
AddType application/zip zip
AddType audio/basic au snd
AddType audio/x-aiff aif aiff aifc
AddType audio/x-midi mid
AddType audio/x-pn-realaudio ram
AddType audio/x-wav wav
AddType image/gif gif GIF
AddType image/ief ief
AddType image/jpeg jpeg jpg jpe JPG
AddType image/tiff tiff tif
AddType image/x-cmu-raster ras
AddType image/x-portable-anymap pnm
AddType image/x-portable-bitmap pbm
AddType image/x-portable-graymap pgm
AddType image/x-portable-pixmap ppm
AddType image/x-rgb rgb
AddType image/x-xbitmap xbm
AddType image/x-xpixmap xpm
AddType image/x-xwindowdump xwd
AddType text/html html htm
AddType text/plain txt
AddType text/richtext rtx
AddType text/tab-separated-values tsv
AddType text/x-server-parsed-html shtml sht
AddType text/x-setext etx
AddType video/mpeg mpeg mpg mpe
AddType video/quicktime qt mov
AddType video/x-msvideo avi
AddType video/x-sgi-movie movie
AddType x-world/x-vrml wrl
Block Hotlinking
Hotlinking is where you link to resources from other domains rather than hosting the files yourself. A good example would be a video that you really like on someone else’s site. You can either download it, upload it to your own site (assuming there are no copyright issues of course) and embed it in your page from there, or you can embed it straight from the other site’s server, which is hotlinking.
The hotlinking route saves you the bother and the bandwidth (and no, that doesn’t mean we condone it; quite the opposite in fact).
This kind of thing also goes on with CSS and JS files, but it happens most commonly with pictures and video.
Sites like Wikipedia don’t mind you doing this, and there are others who want you to do it because it helps their SEO needs.
Then there are the likes of jQuery, which uses a CDN (Content Delivery Network) to share its JS libraries so you don’t have to host them yourself.
But a lot of web hosts see hotlinking as a way of stealing their material and hogging their bandwidth.
If your site’s not all that big, getting thousands of requests every day that don’t bring visitors to your site or benefit you in any way is going to raise your blood pressure. So, if hotlinking is making you hot under the collar, you can block it by adding some mod_rewrite rules to your .htaccess file.
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?forinstance\.com/.*$ [NC]
RewriteRule \.(gif|jpg|jpeg|png|js|css)$ - [F]
Don’t forget to change forinstance.com in the second RewriteCond to your genuine domain name. Requests that arrive with no referrer at all are let through by the first RewriteCond. Any other request that doesn’t originate from your domain is caught and checked to see if it matches one of the file extensions listed in the RewriteRule. If it matches, the request is denied.
You can easily add other file extensions to the list by editing the final line.
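As a rough sketch of that kind of edit, here is the same rule with a few more extensions added. The extra extensions (webp, svg, mp4) are only examples chosen for illustration, and the snippet assumes mod_rewrite is enabled on your server.

```apache
RewriteEngine On
# Let through requests that send no Referer header at all
RewriteCond %{HTTP_REFERER} !^$
# Let through requests referred by your own domain, over http or https
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?forinstance\.com/ [NC]
# Refuse everything else that asks for one of these file types
RewriteRule \.(gif|jpe?g|png|webp|svg|mp4|js|css)$ - [F,NC]
```

The NC flag on the RewriteRule makes the extension match case-insensitive, so .JPG and .jpg are treated alike.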
Offering Substitute Content
If you want to inform hotlinkers that you take a dim view of their practice, you can serve up alternatives to the images (or whatever) that they’re requesting. You could send them an image of a red-hot chain at a foundry with a big cross through it and the phrase “Hotlinking Is Pretty Cold, Bro!” or “Find the Original Content Here at http://forinstance.com”.
This time you don’t fail the request; you just redirect it to your alternative file:
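RewriteEngine On
# Images: each substitute file is excluded from its own rule to avoid a redirect loop
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?forinstance\.com/.*$ [NC]
RewriteCond %{REQUEST_URI} !/no-hotlinking\.jpg$
RewriteRule \.(gif|jpg)$ http://www.forinstance.com/no-hotlinking.jpg [R,L]
# JavaScript files
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?forinstance\.com/.*$ [NC]
RewriteCond %{REQUEST_URI} !/break-everything\.js$
RewriteRule \.(js)$ http://www.forinstance.com/break-everything.js [R,L]
# Stylesheets
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?forinstance\.com/.*$ [NC]
RewriteCond %{REQUEST_URI} !/super-ugly\.css$
RewriteRule \.(css)$ http://www.forinstance.com/super-ugly.css [R,L]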
Disable or Enable Index
What would happen if you had a directory filled with documents or other file types, but you didn’t have an index.html file, and you hadn’t specified a directory page in the .htaccess file either?
Typically, this would result in a generic directory listing of every file it contained.
So, if your hosting directory had a folder labeled /videos, but no index.html page, anyone going to http://yoursite.com/videos could see a list of every video on your site.
Most web servers do this by default; it’s a throwback to that initial view of the web as a place to store things, but it isn’t what you want to be doing these days.
Lots of web-hosting accounts will disable this as a matter of course, but if yours doesn’t and you need to disable automatically generated directory listings, it’s not hard to do:
Options -Indexes
You can enable them just by reversing this command:
Options +Indexes
Keeping some files hidden from the Index
You can also show directory listings but hide certain file types from the list.
IndexIgnore *.swf *.png
The * is a wildcard character. This directive will hide every file that has a .swf or .png extension. To be more specific, you can name individual files instead of matching a whole extension.
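As a hedged illustration of a more specific rule, the sketch below hides two named files and everything whose name starts with a given prefix. The file names and the prefix are hypothetical, made up for this example, and the directive assumes directory listings are generated by mod_autoindex.

```apache
# Hide two specific files and every file whose name starts with "draft-"
# (all three names are hypothetical examples)
IndexIgnore secret-plans.txt budget.xls draft-*
```

IndexIgnore only hides entries from the generated listing. The files can still be fetched by anyone who knows the URL.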
Enabling CGI Everywhere
CGI, which stands for Common Gateway Interface, is a standard way for the web server to run external programs or scripts (written in Perl, for example) and send their output back to the browser as a web page.
CGI scripts are normally kept in a folder named /cgi-bin. The webserver is configured to treat any resource in that directory as a script, instead of a page.
There are two problems with that. First, URLs which reference CGI resources must have /cgi-bin/ in them, which puts implementation details into your URLs, an anti-pattern you should steer clear of for a few reasons.
Second, an elaborate site might need a better structure than just having a load of scripts crammed into one /cgi-bin folder.
To get your web server to parse CGI scripts regardless of where they might be sitting in your directory structure, just put this in your .htaccess file:
AddHandler cgi-script .cgi
Options +ExecCGI
If you’ve got other file extensions and you’d like them to be processed as CGI scripts, just add them to the AddHandler line.
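For instance, a version that also treats Perl and Python files as CGI scripts might look like the sketch below. The extra extensions are just examples, and it assumes your host lets .htaccess files change Options and handlers.

```apache
# Allow CGI execution in this directory and below
Options +ExecCGI
# Treat files with any of these extensions as CGI scripts
AddHandler cgi-script .cgi .pl .py
```

Each script still needs to be executable on the server and to start with a valid interpreter line, otherwise Apache will answer with a server error instead of running it.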
Scripts as Source Code
Most of the time, all the scripts go in your web directory because they need to be run as scripts.
But you don’t always want that. There are times when you want site visitors to be able to view the source code, rather than have it be executed.
One of those times is if your site is a storehouse for code that is there for others to use.
Your .htaccess file can help you do this by stripping out the script handler for particular types of file and putting in a handler for text instead.
# RemoveHandler takes only file extensions, not a handler name
RemoveHandler .pl .cgi .php .py
AddType text/plain .pl .cgi .php .py
Alternatively, as we said before, you can specify that files with these extensions are downloaded by default, instead of being displayed.
# Same idea, but the browser is told to download the file
RemoveHandler .pl .cgi .php .py
AddType application/octet-stream .pl .cgi .php .py
Be on your guard with both of these, though. If you want just some of your files to display this way, but you’re still using these scripts for the rest of your website, putting that directive into your web root’s .htaccess file is going to cause you some headaches.
You’re better off putting scripts that you only want to display into their own designated directory, and then putting the directive into an .htaccess file in that same folder.
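A minimal sketch of that layout follows, assuming a directory called /code-samples/ (a hypothetical name) that holds only the scripts you want people to read. It deliberately leaves PHP out, because how PHP files are handled depends on how your host has wired PHP into Apache.

```apache
# .htaccess placed inside /code-samples/ (hypothetical directory name)
# Undo any handler mappings inherited from higher up for these extensions
RemoveHandler .pl .cgi .py
# Serve the files as plain text so the browser displays the source
AddType text/plain .pl .cgi .py
# Belt and braces: forbid CGI execution in this directory
Options -ExecCGI
```

Because the file lives in its own folder, scripts everywhere else on the site keep running as normal.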
Configuring PHP Settings
Sometimes you need to tweak PHP settings, and this is best done using a file called php.ini.
The thing is, some web-hosting companies, particularly shared hosting providers, won’t let their customers do that.
But you can get around this by embedding php.ini rules in your .htaccess file, provided PHP runs as an Apache module (mod_php) on your server.
Here’s the syntax:
php_value [setting name] [value]
So, let’s say you want to increase the maximum file upload size. You’d just say:
php_value upload_max_filesize 12M
You can’t specify every PHP setting in a .htaccess file. For instance, you can’t set disable_classes this way.
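Here is a slightly fuller sketch that combines a couple of settings. It assumes PHP 8 running as an Apache module, where the module file is mod_php.c (on PHP 7 the equivalent would be mod_php7.c), and the values are arbitrary examples. php_flag is the companion directive for on/off settings.

```apache
# Only apply these lines if PHP is loaded as an Apache module;
# under CGI or FastCGI the block is skipped rather than causing an error
<IfModule mod_php.c>
    # Size-type settings use php_value
    php_value upload_max_filesize 12M
    php_value post_max_size 16M
    # Boolean settings use php_flag
    php_flag display_errors off
</IfModule>
```

Wrapping the directives in IfModule is a defensive choice: without it, a server that doesn’t recognise php_value would reject the whole .htaccess file.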
To see a full list of all php.ini settings, check out the official php.ini directives guide.
When Not to Use .htaccess
When you first edit your .htaccess file you can suddenly feel as powerful as a sysadmin.
But try not to let absolute power corrupt you absolutely, because you might find yourself misusing the .htaccess file. When all you have is a hammer, every task can start to look like a nail, but on at least a couple of occasions, when something definitely looks like an .htaccess task, your directive would be better off somewhere else.
Further Up the Tree
When you feel like putting a directive in an .htaccess file, you should probably put it in the httpd.conf file instead, if you have access to it. It’s the configuration settings file for the whole server.
Likewise, the proper home for PHP settings is the php.ini file, and most languages have their own equivalents.
Directives placed higher up the tree, in httpd.conf, php.ini, or the equivalent file for your language, are typically read once when the server starts or reloads. With .htaccess, Apache has to look for the file in every directory along the path and interpret its directives every time there’s a request.
This isn’t so bad if you’re running a low traffic site with just a few .htaccess directives, but it isn’t difficult to see that if your site’s traffic-heavy and it has to churn through a lot of directives, you’re effectively putting the brakes on the whole thing.
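If you do have access to the main configuration, moving directives there could look something like the sketch below. The directory path and the error page are hypothetical, and the two directives inside are only stand-ins for whatever you currently keep in .htaccess.

```apache
# In httpd.conf (or a virtual host file), not in .htaccess
<Directory "/var/www/forinstance.com/public_html">
    # Stop Apache from looking for .htaccess files under this directory
    AllowOverride None
    # Directives that would otherwise live in .htaccess
    Options -Indexes
    ErrorDocument 404 /errors/404.html
</Directory>
```

Unlike .htaccess edits, changes made here only take effect after Apache has been reloaded or restarted.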
It’s a shame that a lot of shared hosting providers won’t let their customers into the httpd.conf or php.ini files, forcing them to settle for the slower .htaccess file.
This doubly penalises them when you compare them side-by-side with custom VPS configurations because shared hosting is also usually under-resourced. That’s why a site with a decent level of traffic would probably be better off with a VPS plan instead of shared hosting.
If you’re using a CMS like WordPress or Drupal, you can do most of the things that you might want to do with a .htaccess file — such as block IP addresses or redirect URLs — from right inside the application.
Often, the CMS works hand in hand with the .htaccess file, adding directives programmatically.
You’re better off opting for this approach to these tasks when it’s available. You’re much less likely to cause problems doing that than if you edit the .htaccess file yourself.
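To give a feel for what a CMS writes on your behalf, the block below is typical of what WordPress adds when pretty permalinks are switched on. Treat it as an approximation: the exact lines vary between WordPress versions and setups.

```apache
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
# Requests for index.php itself are left alone
RewriteRule ^index\.php$ - [L]
# Anything that is not a real file or directory is handed to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
```

WordPress regenerates whatever sits between the BEGIN and END markers, so any directives of your own belong outside them.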
.htaccess basics – troubleshooting
It can certainly be fun to mess around with your .htaccess file, but if you don’t know your .htaccess basics it can also bring your server to its knees.
If your tinkering does reduce your box to a heap of hot slag, then for goodness’ sake put it right one thing at a time. That means you do something, wait to see what happens, then do the next thing. Panic may make you want to tweak everything that’s tweakable, but this is not the way of science. If something works or doesn’t work, then you’ll want to know which directive was responsible, and you can only test that by changing one of them at a time.
Back up your file before you change anything
Backing up every time you make one of the adjustments described above is absolutely crucial, because you can’t just hit CTRL-Z to get back to safety if things go south. A good rule of thumb is that you should always be able to restore a working version. A version control system like Git can help you out here.
Check error logs
If you can’t work out why your problem is a problem, have a look at your Apache error logs. These will usually give you the lowdown on where to look.
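If the default log output isn’t detailed enough and you can edit the server configuration, you can ask Apache to log more. The sketch below assumes Apache 2.4 or later, where log levels can be set per module. LogLevel belongs in httpd.conf or a virtual host and isn’t accepted inside .htaccess.

```apache
# In httpd.conf or a virtual host (not allowed in .htaccess)
# Keep the general log level at "warn", but trace mod_rewrite in detail
LogLevel warn rewrite:trace3
```

Trace logging is very chatty, so it’s worth switching it back once you’ve found the culprit.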
Developers on sites like StackOverflow are often only too willing to help you solve your problem. That doesn’t make you a bad web admin. It makes you a good one. Don’t be afraid to ask for help.
Common .htaccess problems
Sometimes you’ve entered the wrong character; other times your problem is caused by a bizarre confluence of factors. Some problems are just run-of-the-mill.
Here’s a selection of those.
There is only one way to spell .htaccess: it starts with a dot, it’s all lowercase, and it has no file extension. Do not deviate from it.
If your .htaccess file is misbehaving, the filename is the first thing you should check. Some people don’t think to do this. Don’t be one of them.
.htaccess Disabled, or Partly Disabled
A number of shared hosting providers disable .htaccess completely. Others permit it, but then don’t let you use certain directives. If they’re included, they get ignored.
Equally, even with VPS plans or on your own dedicated servers, .htaccess can wind up being disabled.
You can check this yourself if you are allowed to access the httpd.conf file. If you see the directive AllowOverride None, that’s your guilty party. Swap that sucker out for AllowOverride All.
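In httpd.conf, that change might look like the sketch below. The directory path is hypothetical, and the commented-out line shows a more restrictive alternative that only allows certain groups of directives.

```apache
<Directory "/var/www/forinstance.com/public_html">
    # Allow .htaccess files to override everything
    AllowOverride All
    # Or, more restrictively, only selected groups of directives:
    # AllowOverride AuthConfig FileInfo Indexes Limit
</Directory>
```

Apache needs to be reloaded or restarted before the new AllowOverride setting applies.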
If you’re not allowed to do this, you’ll probably need to contact your hosting company’s support team and ask if they can do it for you or suggest a workaround.
Conflicting or Overridden Directives
If you’ve got numerous nested directories, they can all have their own .htaccess file. Each of those .htaccess files, right from the root, and down through each nested directory, will be read in order, all the way through the directory tree.
If something in a subdirectory overrides something that you set in your root directory, precedence will be given to the directive in the .htaccess file closest to the requested file.
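As a small illustration of that precedence, picture two .htaccess files like the ones sketched below. The /downloads subdirectory is a hypothetical example, and the sketch assumes your host allows .htaccess files to change Options.

```apache
# File 1: /public_html/.htaccess
# Turn directory listings off for the whole site
Options -Indexes

# File 2: /public_html/downloads/.htaccess
# Turn them back on for this one folder and everything below it
Options +Indexes
```

A request for /downloads/ gets a directory listing, because the file closest to the requested resource wins. Every other folder still follows the root setting.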
And that’s it from us. We hope you’ve enjoyed this .htaccess tutorial.