MDDHosting Forums

Can't get robots.txt Disallow to work



I have two sites under shared hosting with MDD. One of them hosts downloads meant only for people who have purchased a product (the domain name is given with it). To block this site entirely I have placed a robots.txt file in its root directory (public_html); the file is as follows:

User-agent: *
Disallow: /

In my Google Webmaster Tools account, testing under Blocked URLs says everything is allowed and that the "robots.txt file does not appear to be valid". I have followed the Google guidelines: https://support.google.com/webmasters/answer/156449?hl=en

 

No idea what I have missed. Anyone know what the problem could be? Thanks.


Arpeggio, speaking from experience: if you don't want files to be found, don't put them on a public-facing domain. Robots.txt is a suggestion, not a rule or a law that must be followed. Most download scripts hide the files above your web root folder, or make them reachable only by the web server itself rather than via a public URL. Here are some documents on the topic:

 

Why did this robot ignore my robots.txt file?

Can I block bad robots?

Can robots.txt be used in a court of law?


Hi SnakEyez. The files have to be accessible to people who have bought a copy of a book or eBook; they are music method books that include audio as downloadable MP3s, and the download link appears only inside copies of the book/eBook. I'm not sure how what you are suggesting would apply to that.


It would. I don't have it installed on my account right now because the domain I had it on is under a major revamp. What I did was create a folder called "downloads" above the public_html or www folder on my site. The script I was using then fetched the files via a filesystem path relative to the account root and served them to people who were logged in. Because the folder is above public_html/www, Google will never index it, and because the link sits inside a member-secured login area, the pages with the links were never accessible to the public.

 

Think of it this way. Take a Windows computer on which you want to run a web server. If you install something like Apache/IIS or a pre-made package like WAMP, only the folder you specify as the document root is web-accessible, and you test using the address "http://localhost". You could still create a link to a file such as "C:\Desktop\Myfile.jpg". The trick is that to an application running on that machine, C:\Desktop\Myfile.jpg is a perfectly valid path that points to a real file, but anyone trying to use that address from elsewhere gets nothing, because the file sits outside the web-accessible folder. It's a crude example, but I think it clarifies the point. I had used IP.Downloads before, which is an extension of the forum software that MDDHosting uses, but there are other download management web applications that do this for you so that the files are never web-facing.
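Roughly what that looks like, as a minimal Python sketch (the setup described above used IP.Downloads, not this): the MP3s live in a folder outside public_html, and a small handler streams a file only after checking a purchase token. The folder path, token values, filename parameter, and port below are made up for illustration.

# Minimal sketch: serve purchased downloads from a folder OUTSIDE the web root.
# Assumptions (not from the thread): files live in /home/user/downloads, a
# sibling of public_html, and buyers request /get?file=lesson1.mp3&token=abc123.
# Token storage here is a plain set; a real site would check its order records.

from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs
from pathlib import Path

DOWNLOAD_DIR = Path("/home/user/downloads")   # above public_html, never indexed
VALID_TOKENS = {"abc123", "def456"}           # hypothetical purchase tokens


class DownloadHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        token = query.get("token", [""])[0]
        name = query.get("file", [""])[0]

        # Reject unknown tokens before touching the filesystem.
        if token not in VALID_TOKENS:
            self.send_error(403, "Not a valid purchase token")
            return

        # Resolve inside DOWNLOAD_DIR only, so "../" tricks cannot escape it.
        target = (DOWNLOAD_DIR / name).resolve()
        if DOWNLOAD_DIR.resolve() not in target.parents or not target.is_file():
            self.send_error(404, "No such download")
            return

        data = target.read_bytes()
        self.send_response(200)
        self.send_header("Content-Type", "audio/mpeg")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


if __name__ == "__main__":
    # Hypothetical local port; on shared hosting this logic would normally sit
    # inside whatever download-manager script the host supports.
    HTTPServer(("127.0.0.1", 8000), DownloadHandler).serve_forever()

Because nothing under that downloads folder ever has a public URL, there is nothing for Google to crawl; the only way to the files is through the token check.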

