Affiliate Marketing Tutorial 41 | Robots.txt – Telling The Search Engines What They Index

When your site is indexed by the search engines, it is “crawled” by the search engine spiders – GoogleBot, Yahoo Slurp, Bingbot – in order to find all the content on your site, so that other people can find it.

But what if you’ve got sections of your website that you don’t want indexed? The bots dumbly index whatever they can find. They don’t know that, for example, those photos on the hidden part of your site are strictly friends and family only. They also don’t know that there are certain pages, like your long-expired special offers, that you’d really rather not have popping up in the search engine listings or being archived by that pesky internet archive bot. In this lesson we look at robots.txt – telling the search engines what they can and cannot index.

What is the robots.txt file?

Robots.txt is a small text file that lives in the root of your website and tells the “robots” visiting your website which pages they can and cannot access. When a well-behaved robot visits your site, the first thing it does is look for the robots.txt file. It then follows your requests and won’t visit pages that you’ve disallowed.
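To make “the root of your website” concrete, here is a small Python sketch that works out where a robot would look for the file, given any page address on your site. The function name robots_txt_url is made up for this illustration, and example.com is just a placeholder domain.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt address for the site that page_url belongs to."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path with /robots.txt,
    # and drop any query string or fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# However deep the page is, the robots.txt address is always at the root.
print(robots_txt_url("http://www.example.com/photos/staffchristmasparty/index.html"))
# http://www.example.com/robots.txt
```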

How do you make a robots.txt file?

Decide which areas of your website you want the spiders to index, and which ones you don’t want them crawling through. And decide if there are any bots you would rather not have crawling through your site.

Open up your plaintext editor of choice, create a new, blank text file and save it as robots.txt, then write this information into the file:

To block all spiders from your entire website:

User-agent: *
Disallow: /

To let all spiders see all content on your site:

User-agent: *
Disallow:

To block certain directories:

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /personal/
Disallow: /photos/staffchristmasparty/

To block a certain spider:

User-agent: Googlebot
Disallow: /

To allow a certain spider, while blocking others:

User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
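If you’d like to double-check rules like the ones above before uploading them, one option is Python’s built-in urllib.robotparser module. The sketch below is only an illustration: the helper name parse_rules, the bot name Bingbot and the example.com addresses are stand-ins, and real search engines may treat edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def parse_rules(robots_txt: str) -> RobotFileParser:
    """Load robots.txt rules from a string instead of fetching them from a site."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Same rules as the "block certain directories" example.
BLOCK_DIRECTORIES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /personal/
Disallow: /photos/staffchristmasparty/
"""

# Same rules as the "allow a certain spider, while blocking others" example.
ALLOW_ONLY_GOOGLEBOT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""

dirs = parse_rules(BLOCK_DIRECTORIES)
print(dirs.can_fetch("Bingbot", "http://www.example.com/personal/diary.html"))  # False
print(dirs.can_fetch("Bingbot", "http://www.example.com/index.html"))           # True

only_google = parse_rules(ALLOW_ONLY_GOOGLEBOT)
print(only_google.can_fetch("Googlebot", "http://www.example.com/index.html"))  # True
print(only_google.can_fetch("Bingbot", "http://www.example.com/index.html"))    # False
```

can_fetch answers the question a polite bot asks itself before requesting a page: “am I allowed to visit this address?”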

Tips:

  • You must use a new line for each instruction.
  • Blank lines are used to show separate groups of instructions (as in the last example).
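  • The asterisk in the User-agent line means “all robots”. Wildcards inside Disallow paths were not part of the original robots.txt convention, so a rule like Disallow: *.gif won’t work everywhere. The major search engines (Google and Bing, for example) do accept * and $ in paths, so Disallow: /*.gif$ blocks GIF images for those bots, but don’t count on smaller bots understanding it.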
  • Your file must be called robots.txt, all in lower-case.
  • Your file must be located in the root directory of your website: www.yoursite.com/robots.txt. That’s where the spiders look when they visit your site, and they won’t find it if you put it anywhere else.

Now simply save your file and upload it to your website.
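If you’d rather generate the file than type it by hand, the formatting rules from the tips (one instruction per line, a blank line between groups) can be sketched in a few lines of Python. The function name build_robots_txt and its input format are invented for this example; treat it as a starting point, not a finished tool.

```python
def build_robots_txt(groups):
    """Build robots.txt text from a list of (user_agent, disallowed_paths) pairs."""
    blocks = []
    for user_agent, disallowed in groups:
        lines = [f"User-agent: {user_agent}"]
        # One Disallow line per path; an empty Disallow line means
        # nothing is blocked for this group.
        lines += [f"Disallow: {path}" for path in disallowed] or ["Disallow:"]
        blocks.append("\n".join(lines))
    # A blank line separates each group of instructions.
    return "\n\n".join(blocks) + "\n"


# Allow Googlebot everywhere, block every other bot from the whole site.
text = build_robots_txt([("Googlebot", []), ("*", ["/"])])
print(text)
```

The printed text is what you would save as robots.txt and upload to the root of your site.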

Robots.txt and your XML sitemap

If you’ve seen our lesson on creating XML sitemaps, you’ll know that your robots.txt file is a really handy place to let the search engines know where your sitemap is.

All you have to do is leave a blank line after the last command in your robots.txt file, and then paste this little line:

Sitemap: http://www.example.com/sitemap.xml

If you’ve got more than one sitemap, you can enter more than one line.

Sitemap: http://www.example.com/sitemap1.xml
Sitemap: http://www.example.com/sitemap2.xml
Sitemap: http://www.example.com/sitemap3.xml

This way you don’t need to specifically tell each and every search engine where they can find your sitemap. They’ll see it as soon as they look for your robots.txt file, which every polite bot will do when they visit your site anyway.
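To see roughly what a bot picks up from those Sitemap lines, here is a short Python sketch using the built-in urllib.robotparser module (its site_maps() method needs Python 3.8 or later). The helper name list_sitemaps and the sample file contents are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt contents, with the sitemap lines after a blank line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tmp/

Sitemap: http://www.example.com/sitemap1.xml
Sitemap: http://www.example.com/sitemap2.xml
"""


def list_sitemaps(robots_txt: str) -> list:
    """Return every sitemap address listed in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap lines are present.
    return parser.site_maps() or []


print(list_sitemaps(ROBOTS_TXT))
# ['http://www.example.com/sitemap1.xml', 'http://www.example.com/sitemap2.xml']
```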

Things you need to know

Not all spiders honor robots.txt

“Polite” spiders, such as those belonging to the major search engines, won’t crawl items you’ve disallowed in your robots.txt file. However, not all robots are polite (for example, those from some smaller search engines, or general data-scraping bots), and the impolite ones will collect any and all content anyway.

Your robots.txt is publicly accessible!

Don’t try to use your robots.txt file to hide content on your site. The file can be viewed by anybody, simply by typing www.yoursite.com/robots.txt into their browser, so anybody can see the things you’ve said you don’t want indexed!

If there’s content on your website that you really, really don’t want anybody else seeing, your best bet is to password-protect that directory. There will usually be a tool to help you do this in your hosting control panel (cPanel or similar). Note that password-protecting your content (if done right) will also prevent the “impolite” bots from accessing it.

Lesson Summary

In this lesson we’ve looked at robots.txt – what it is, what it’s used for, and how to create one. We’ve looked at certain things you can do with robots.txt including:

  • Blocking your entire site from indexing
  • Blocking certain directories
  • Blocking certain bots
  • Identifying the location of the sitemap
