If you are a business owner and use a WordPress website to communicate with your customers, it is vital to promote it in the search engines. Search engine optimization involves many important steps, and one of them is building a good robots.txt file.
What do you need this file for? What is its role? Where is it located on your WordPress website? What are the ways to create it?
Let’s take a closer look.
What is a robots.txt file?
When you build a new website, search engines like Google and Bing use special bots to scan it. They then create a detailed map of all its pages, which helps them decide which pages to show when someone searches for relevant keywords.
The problem is that modern websites contain many other elements besides pages. For example, WordPress allows you to install plugins that often have their own directories. It is not a good idea to show them on the search results page, because these folders contain sensitive content, which might be a big security risk for the site.
To control which folders get scanned, most website owners use the WordPress robots.txt file, which provides a set of guidelines for search engine bots. In it, you specify which folders can be scanned and which must remain hidden from the search bots. The file can be as detailed as you want, and it is very easy to create.
In practice, search engines will still scan your site even if you do not create a robots.txt file. However, skipping it is unwise. Without this file, search robots assume they may index all the content of your site, including the parts you would rather keep out of public view.
More importantly, without a WordPress robots.txt file, search bots may crawl more of your website than necessary, and that extra load can hurt its performance. Even if your site's traffic is still small, page loading speed should always be a priority. After all, there are few things people dislike more than a slow website.
Where is the robots.txt file for WordPress located?
When you create a WordPress website, WordPress automatically generates a robots.txt file and serves it from the root of your site. For example, if your website address is example.com, you can find it at example.com/robots.txt and open it in any browser. It will contain lines like the following:
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
This is an example of the simplest robots.txt file. In plain language, the value after User-agent: declares which robots the rules are for. An asterisk means that the rule is universal and applies to all bots. In this case, the file tells the bots that they should not scan the wp-admin and wp-includes directories, because these directories contain many files that should be kept away from public access.
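If you want to see how a crawler would read these rules, the following minimal Python sketch feeds them to the standard library's urllib.robotparser module. The example.com URLs and the Googlebot name are only illustrative assumptions and are not taken from a real site.

```python
from urllib.robotparser import RobotFileParser

# The basic rules discussed above.
DEFAULT_RULES = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(DEFAULT_RULES)

# Hypothetical URLs, used only to illustrate the check.
for url in (
    "https://example.com/wp-admin/options.php",
    "https://example.com/wp-includes/version.php",
    "https://example.com/blog/hello-world/",
):
    # can_fetch() returns True when the named bot may scan the URL.
    print(url, parser.can_fetch("Googlebot", url))
```

With these rules, the two WordPress system paths are reported as blocked and the blog post as allowed.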
Of course, you can add more rules to your file. Before you do this, you need to understand that the default file is virtual: WordPress generates it on request, so there is nothing on disk to edit until you create a real file. A real robots.txt file belongs in the root directory, which is often called public_html, www, or is named after your site.
You can use any FTP client, such as FileZilla, to access this directory and upload a new version of the file to the server. All you need is the login and password for the FTP connection. Contact your hosting provider's technical support if you do not have them.
Some basic requirements for a WordPress robots.txt file
- It must be available in the root directory of the website. Its address will look like example.com/robots.txt.
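- File size should not exceed 32 kilobytes if you want to stay within the strictest limits search engines have applied (Google, for example, reads up to 500 KiB).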
- The text must contain only Latin characters. If your domain name uses other characters, convert it to its Punycode (ASCII) form first.
Do not forget that:
- robots.txt instructions are advisory in nature.
- robots.txt settings do not affect other sites (in robots.txt, you can close only the pages or files on the current site).
- robots.txt rules are case-sensitive: /Folder/ and /folder/ are different paths.
Types of robots.txt instructions to search robots:
- Partial access to particular parts of the website.
- Full scan disallowance.
When should you use robots.txt?
Using the WordPress robots.txt file, you can close pages that you don’t want indexed from search robots, for example:
- pages with personal user information;
- pages with documentation and service information that does not affect how the interface is displayed on the screen;
- certain file types, for example, PDF files;
- WordPress dashboard, etc.
The structure of a robots.txt file
The webmaster can create the WordPress robots.txt file using any text editor. Its syntax includes three main elements:
User-agent: [name of the search robot]
Disallow: [path you want to close access to]
Allow: [path you want to open access to]
Moreover, the file may also contain additional elements, such as:
Sitemap: [site map address]
Then, place the created robots.txt file in the root directory of the website. If your website uses the primary domain, the file will be located in the /public_html/ or /www/ folder, depending on the hosting provider. Some hosts differ slightly, but most use this structure. If the domain is an add-on domain, the folder name will include the website name and look like /example.com/.
To place the file in the appropriate folder, you will need an FTP client (for example, FileZilla) and access to FTP, which the provider gives you when you buy a hosting plan.
The User-agent line opens a group of rules: all the instructions that follow it are read as a whole and apply only to the search robots named in that line. In total, there are about 300 different search robots. If you want to apply the same rules to all of them, it is enough to put an asterisk (*) in the “User-agent” field. This character means any sequence of characters. As a result, the line will look like this:
User-agent: *
The Disallow command tells search robots which parts of the site should not be scanned. If you put Disallow: / in robots.txt, it closes all website content from scanning. If you need to close a specific folder from scanning, use Disallow: /folder.
Similarly, you can hide a specific URL, file, or file format. For example, if you need to close all PDF files on the site from being indexed, write the following instruction in the WordPress robots.txt file:
Disallow: /*.pdf$
The asterisk before the file extension means any sequence of characters (any name), and the dollar sign marks the end of the URL, so the rule closes only the files whose address ends with the .pdf extension.
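To make the wildcard rule more concrete, here is a small Python sketch of how such a pattern can be matched against a URL path. It only illustrates the matching idea described above (an asterisk for any characters, a dollar sign for the end of the URL) and is not the actual code of any search engine. The function names are made up for this example.

```python
import re


def rule_to_regex(path_pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regular expression.

    '*' stands for any sequence of characters; a trailing '$' means
    the URL must end right there.
    """
    anchored = path_pattern.endswith("$")
    if anchored:
        path_pattern = path_pattern[:-1]
    # Escape every literal piece, then join the pieces with ".*".
    regex = ".*".join(re.escape(piece) for piece in path_pattern.split("*"))
    if anchored:
        regex += r"\Z"  # match only at the very end of the URL
    return re.compile(regex)


def rule_matches(path_pattern: str, url_path: str) -> bool:
    """Return True if the rule applies to the URL path (matched from its start)."""
    return rule_to_regex(path_pattern).match(url_path) is not None


# Hypothetical paths, used only for illustration.
print(rule_matches("/*.pdf$", "/files/report.pdf"))
print(rule_matches("/*.pdf$", "/files/report.pdf?download=1"))
print(rule_matches("/folder", "/folder/page.html"))
```

The first and third checks match, while the second does not, because the dollar sign requires the address to end with .pdf.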
Google's reference materials on robots.txt list the commands you can use to block URLs.
The Allow command opens a file, folder, or page for scanning. Let’s suppose you need to open for scanning only the pages whose address starts with /other and close all other content. In this case, use the following combination:
User-agent: *
Allow: /other
Disallow: /
The Allow and Disallow rules are sorted by the length of their URL prefix (from the shortest to the longest) and applied in that order, so the longest matching rule has the final say. In this example, the robot would first read Disallow: / and then Allow: /other, which means the /other folder would be indexed.
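The ordering rule can be sketched in a few lines of Python. This is only an illustration of the description above: the is_allowed function and the tie-breaking choice (Allow wins when two prefixes have the same length) are our assumptions, and individual search engines may differ in the details.

```python
def is_allowed(rules: list[tuple[str, str]], url_path: str) -> bool:
    """Decide whether url_path may be scanned.

    rules is a list of (directive, prefix) pairs, for example
    ("Disallow", "/"). Rules are sorted from the shortest prefix to the
    longest and applied one after another, so the last (longest)
    matching rule decides.
    """
    decision = True  # nothing matched: scanning is allowed
    ordered = sorted(
        rules,
        # Assumption: for equal lengths, Allow is applied last and wins.
        key=lambda rule: (len(rule[1]), rule[0].lower() == "allow"),
    )
    for directive, prefix in ordered:
        if url_path.startswith(prefix):
            decision = directive.lower() == "allow"
    return decision


# The combination from the example above.
RULES = [("Allow", "/other"), ("Disallow", "/")]

print(is_allowed(RULES, "/other/page.html"))
print(is_allowed(RULES, "/contact"))
```

Here the /other page is allowed because Allow: /other is the longer match, while /contact is matched only by Disallow: / and stays closed.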
Typical mistakes in the robots.txt file
Wrong order of commands. There should be a clear logical sequence of instructions: first User-agent, then Allow and Disallow. If you allow the entire site but disallow separate sections or files, put Allow first and Disallow after it. If you disallow a whole section but want to open some of its parts, put Disallow above Allow.
Multiple folders or directories in one Allow or Disallow instruction. If you want to list several Allow and Disallow instructions in the robots.txt file, enter each of them on a new line, for example Disallow: /wp-admin/ on one line and Disallow: /wp-includes/ on the next.
Incorrect file name. The name must be exactly “robots.txt”, consisting only of lowercase Latin letters.
Empty User-agent rule. If you want to set general instructions for all robots, put an asterisk.
Syntax errors. If you mistakenly add extra syntax elements to an instruction, the robot may misinterpret it.
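Several of these mistakes can be caught automatically before you upload the file. The following Python sketch is a deliberately simple checker written for this article, not an official validator. The lint_robots_txt name is hypothetical, and the list of known directives is limited to the ones mentioned above.

```python
# Directives named in this article; extend the set if you rely on others.
KNOWN_DIRECTIVES = {"user-agent", "disallow", "allow", "sitemap"}


def lint_robots_txt(text: str) -> list[str]:
    """Return a list of human-readable problems found in robots.txt text."""
    problems = []
    seen_user_agent = False
    for number, raw_line in enumerate(text.splitlines(), start=1):
        line = raw_line.split("#", 1)[0].strip()  # ignore comments
        if not line:
            continue
        if ":" not in line:
            problems.append(f"line {number}: missing ':' separator")
            continue
        name, value = (part.strip() for part in line.split(":", 1))
        directive = name.lower()
        if directive not in KNOWN_DIRECTIVES:
            problems.append(f"line {number}: unknown directive '{name}'")
        elif directive == "user-agent":
            seen_user_agent = True
            if not value:
                problems.append(f"line {number}: empty User-agent")
        elif directive in ("allow", "disallow"):
            if not seen_user_agent:
                problems.append(f"line {number}: {name} before any User-agent")
            if len(value.split()) > 1:
                problems.append(f"line {number}: more than one path in one rule")
    return problems


# A deliberately broken file, used only for illustration.
BROKEN = "Disallow: /wp-admin/ /wp-includes/\nUser-agent:\nDisalow: /tmp/\n"
for problem in lint_robots_txt(BROKEN):
    print(problem)
```

A file that follows the rules from this section produces an empty list, so you can run the check each time you edit the file.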
How to create the robots.txt file for your WordPress website
As soon as you decide to create your robots.txt file, all you need is to find a way to create it. You can edit robots.txt in WordPress using a plugin or do it manually. In this section, we will show you how to use two of the most popular plugins for this task and discuss how to create and upload a file manually. Let’s go!
1. Using the Yoast SEO plugin
The Yoast SEO plugin hardly needs an introduction. It is the most well-known SEO plugin for WordPress, and it helps you improve your posts and pages so that they use keywords more effectively. In addition, it rates the readability of your content, which can help widen your potential audience. Many developers admire the Yoast SEO plugin for its simplicity and convenience.
One of its basic features is building a robots.txt file for your website. Once you install and activate the plugin, go to the SEO > Tools tab in the plugin console and find the File Editor option.
Clicking this link lets you edit the .htaccess file without leaving the admin console. There is also a Create a robots.txt file button.
After clicking the button, the plugin will display a new editor, where you can directly edit your robots.txt file. Please note that Yoast SEO sets its own default rules, which override the rules of an existing virtual robots.txt file.
After deleting or adding rules, click the Save Changes button to apply them.
That’s all! Let’s now look at another popular plugin that can perform the same task.
2. Using the All-in-One SEO Pack plugin
The All-in-One SEO Pack plugin is another great WordPress plugin for search engine optimization. It includes most of the features of the Yoast SEO plugin, but some website owners prefer it because it is more lightweight. Creating the robots.txt file with this plugin is also easy.
After installing the plugin, go to All in One SEO > Manage Modules in the console. Inside, you will find the Robots.txt option with a big blue Activate button at the bottom right. Click it.
Now, you will find a new Robots.txt tab in the All in One SEO menu. Click it to see the settings for adding new rules to your file. Then save the changes or delete everything.
Please note that, unlike Yoast SEO, which lets you enter anything you want, this plugin does not let you modify the robots.txt file directly. The file content is shown read-only on a gray background.
But since adding new rules is very easy, this should not upset you. More importantly, the All in One SEO Pack also includes a feature that helps you block “bad” bots. You can find it in the All in One SEO tab.
That’s all you need to do if you choose this method. Now let’s talk about how to create a WordPress robots.txt file manually if you do not want to install an additional plugin just for this task.
Creating and uploading a robots.txt file for WordPress via FTP
To create the robots.txt file manually, open your favorite text editor (like Notepad or TextEdit), add all the necessary commands, and save the file as robots.txt on your local drive. It literally takes a few seconds, so you may well prefer to create a robots.txt for WordPress without using a plugin.
A quick example of such a file would simply repeat the basic rules shown earlier: a User-agent: * line followed by Disallow: /wp-admin/ and Disallow: /wp-includes/, each on its own line.
Once you have made your own file, connect to your site via FTP and put the file into the root folder. In most cases, it is the public_html or www directory. You can upload the file either by right-clicking it in your FTP client or simply by dragging and dropping it.
It also takes a few seconds. As you can see, this method is no more difficult than using the plugin.
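If you prefer to script the upload instead of using a graphical FTP client, the same step can be sketched with Python's standard ftplib module. Treat this as a rough outline under stated assumptions: the host name, login, password, and the public_html folder are placeholders that you must replace with the values from your hosting provider, and the function names are made up for this example.

```python
from ftplib import FTP
from pathlib import Path


def upload_robots_txt(ftp, local_file, remote_dir="public_html"):
    """Upload a local robots.txt through an already connected FTP session.

    remote_dir is an assumption: use the root folder named by your host.
    """
    ftp.cwd(remote_dir)  # switch to the website root folder
    with Path(local_file).open("rb") as handle:
        # The remote name must be exactly robots.txt.
        ftp.storbinary("STOR robots.txt", handle)


def connect_and_upload(host, user, password, local_file="robots.txt"):
    """Open an FTP connection with your own credentials and upload the file."""
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        upload_robots_txt(ftp, local_file)


# Example call with placeholder values (not run here):
# connect_and_upload("ftp.example.com", "your-login", "your-password")
```

Keeping the upload in its own function means you can reuse one connection for other files, and the commented call shows where your real credentials would go.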
How to test the robots.txt file for your WordPress website
Now, it is time to check your robots.txt file for errors in the Google Search Console. Search Console is a set of Google tools designed to help you keep track of how your content appears on the search results page. One of these tools checks robots.txt, and you can use it by opening the robots.txt check tool in your console.
Here you will find an editor field where you can add the code for your WordPress robots.txt file and click Submit. The Google Search Console will ask whether you want to use the new code or download the file from your site. Select the Ask Google to Update option to post the code manually.
Now the platform will check your file for errors. If it finds an error, it will immediately notify you.
The WordPress robots.txt file is a very powerful tool for controlling how search engine bots see your website. Despite its importance, it is not very difficult to create. Is there an ideal file? We cannot say so. It will differ depending on your website content and the result you want to achieve.
I am a Co-Founder at WPOven INC currently living in Vancouver, Canada. My interests range from web development and WordPress to product development, client projects, and entrepreneurship.