Serving an alternate robots.txt for SSL requests.

Juan Fco. Giordana
Serving an alternate robots.txt for SSL requests.
January 07, 2009 04:25PM
Hello list,

I'm using nginx-0.6.34 with the try_files patch applied and I'm trying
to serve an alternate robots.txt for requests on port 443 so pages under
secure connections are not shown by web crawlers.

I've tried many different approaches and couldn't get any of them to
work as I expected:

if ($server_port = 443)
if ($remote_port = 443)
if ($scheme = https)
With and without the location block.
if () blocks inside @try_files rule.
rewrite flags: break, last, permanent.
All rewrite rules disabled except the one in question.

server {
    [...]
    location /robots.txt {
        if ($server_port = 443) {
            rewrite ^robots\.txt$ robots_ssl.txt last;
        }
    }
    [...]
}

Most of these approaches always returned robots.txt on both SSL and
non-SSL, while others gave a 404 error under SSL. None of them ever
served robots_ssl.txt.
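
One thing I'm beginning to suspect (though I haven't verified it) is the
pattern itself: if the rewrite regex is matched against the request URI
including its leading slash, then ^robots\.txt$ would never match
/robots.txt and the plain file would be served. Something like this is
what I had in mind, with the slash added and robots_ssl.txt assumed to
sit in the same document root:

server {
    [...]
    location /robots.txt {
        if ($server_port = 443) {
            # match against the full URI, which starts with "/"
            rewrite ^/robots\.txt$ /robots_ssl.txt last;
        }
    }
    [...]
}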

Am I doing something wrong?

Thanks!
Nick Pearson
Re: Serving an alternate robots.txt for SSL requests.
January 07, 2009 04:45PM
Hi Juan,

Try using two server directives -- one for http and one for https. The
server directive chosen depends on the port that is requested. Something
like this:

server {
    listen 80;  # http
    server_name www.yoursite.com;
    [...]
    location /robots.txt {
        break;
    }
}

server {
    listen 443;  # https
    server_name www.yoursite.com;
    [...]
    location /robots.txt {
        rewrite (.*) /robots_ssl.txt;
    }
}
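
If you'd rather avoid the second location lookup that an unflagged
rewrite triggers, you could also rewrite with the break flag -- an
untested sketch, assuming robots_ssl.txt lives under the same root:

location /robots.txt {
    # "break" stops rewrite processing and serves the rewritten URI
    # from this location, without searching for another matching location
    rewrite ^ /robots_ssl.txt break;
}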


On Wed, Jan 7, 2009 at 9:06 AM, Juan Fco. Giordana
<[email protected]> wrote:

> Hello list,
>
> I'm using nginx-0.6.34 with the try_files patch applied and I'm trying to
> serve an alternate robots.txt for requests on port 443 so pages under secure
> connections are not shown by web crawlers.
>
> I've tried many different approaches and couldn't get any of them to work
> as I expected:
>
> if ($server_port = 443)
> if ($remote_port = 443)
> if ($scheme = https)
> With and without the location block.
> if () blocks inside @try_files rule.
> rewrite flags: break, last, permanent.
> All rewrite rules disabled except the one in question.
>
> server {
> [...]
> location /robots.txt {
> if ($server_port = 443) {
> rewrite ^robots\.txt$ robots_ssl.txt last;
> }
> }
> [...]
> }
>
> Most of these approaches always returned robots.txt on both SSL and non-SSL,
> while others gave a 404 error under SSL. None of them ever served robots_ssl.txt.
>
> Am I doing something wrong?
>
> Thanks!
>
>
Juan Fco. Giordana
Re: Serving an alternate robots.txt for SSL requests.
January 14, 2009 12:35AM
Thank you Nick for your help,

I've followed your suggestions and it worked as expected.

I've changed the rewrite rule since I don't need to capture anything there.

server {
    listen 443;
    [...]
    location = /robots.txt {
        rewrite ^ /robots_ssl.txt last;
    }
}

Does anybody know if it's possible to do this within a single server
context that handles both protocols in version 0.7.*?

Thanks.

On 2009-01-07 15:23:26 Nick Pearson wrote:
> Hi Juan,
>
> Try using two server directives -- one for http and one for https. The
> server directive chosen depends on the port that is requested. Something
> like this:
>
> server {
> listen 80; # http
> server_name www.yoursite.com;
> [...]
> location /robots.txt {
> break;
> }
> }
> server {
> listen 443; # https
> server_name www.yoursite.com;
> [...]
> location /robots.txt {
> rewrite (.*) /robots_ssl.txt;
> }
> }
Dave Cheney
Re: Serving an alternate robots.txt for SSL requests.
January 14, 2009 01:15AM
> Does anybody know if it's possible to do this within a single server
> context that handles both protocols in version 0.7.*?
>
> Thanks.
>

I prefer to put my vhost definitions in a separate file, so my version of
this would look something like this:

server {
    listen 80;
    include vhost.d/vhost.conf;
}

server {
    listen 443;
    include ssl.conf;

    location = /robots.txt { ... }

    include vhost.d/vhost.conf;
}
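
For context, vhost.d/vhost.conf just holds everything common to both
vhosts. A made-up fragment to illustrate -- the names and paths are
placeholders, not my real config -- with ssl.conf holding only the
certificate directives:

# vhost.d/vhost.conf -- hypothetical shared fragment,
# included inside each server block above
server_name www.example.com;
root /var/www/example;

location / {
    index index.html index.htm;
}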

Cheers

Dave
Juan Fco. Giordana
Re: Serving an alternate robots.txt for SSL requests.
January 14, 2009 04:15AM
I actually do the same. The thing is, I just don't like that.

I tend to use the exact same configuration on both SSL and non-SSL
connections, to let users browse the site securely if they want to.

But duplicating the configuration seems to lead to problems.

Dave Cheney wrote:
> I prefer to put my vhost definitions in a separate file, so my version of
> this would look something like this:
>
> server {
> listen 80;
> include vhost.d/vhost.conf;
> }
>
> server {
> listen 443;
> include ssl.conf;
>
> location = /robots.txt { ... }
>
> include vhost.d/vhost.conf;
> }
>
> Cheers
>
> Dave
>
Igor Sysoev
Re: Serving an alternate robots.txt for SSL requests.
January 14, 2009 08:25AM
On Tue, Jan 13, 2009 at 09:15:07PM -0200, Juan Fco. Giordana wrote:

> Thank you Nick for your help,
>
> I've followed your suggestions and it worked as expected.
>
> I've changed the rewrite rule since I don't need to capture anything there.
>
> server {
> listen 443;
> [...]
> location = /robots.txt {
> rewrite ^ /robots_ssl.txt last;
> }
> }
>
> Does anybody know if it's possible to do this within a single server
> context that handles both protocols in version 0.7.*?

If your servers are different only in this part, then in 0.7 you can

server {
    listen 80;
    listen 443 default ssl;

    location = /robots.txt {
        if ($server_port = 443) {           # or ($scheme = https)
            rewrite ^ /robots_ssl.txt last; # or "break;"
        }
    }

    ...

If the servers have many differences, then it's better to use
separate servers. In that case you do not need a rewrite at all; just use alias:

server {
    listen 443;

    location = /robots.txt {
        alias /path/to/robots_ssl.txt;
    }

Yet another way (better than using if/rewrite):

map $scheme $robots {
    default  robots.txt;
    https    robots_ssl.txt;
}

server {
    listen 80;
    listen 443 default ssl;

    location = /robots.txt {
        alias /path/to/$robots;
    }

    ...
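
For completeness, here is the map variant as one minimal,
self-contained sketch; server_name, certificate paths, and the
/path/to/ prefix are placeholders:

http {
    # map must be declared at the http level, outside any server block
    map $scheme $robots {
        default  robots.txt;
        https    robots_ssl.txt;
    }

    server {
        listen 80;
        listen 443 default ssl;
        server_name www.example.com;               # placeholder

        ssl_certificate     /path/to/cert.pem;     # placeholder
        ssl_certificate_key /path/to/cert.key;     # placeholder

        location = /robots.txt {
            alias /path/to/$robots;                # placeholder path
        }
    }
}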


> Thanks.
>
> On 2009-01-07 15:23:26 Nick Pearson wrote:
> >Hi Juan,
> >
> >Try using two server directives -- one for http and one for https. The
> >server directive chosen depends on the port that is requested. Something
> >like this:
> >
> >server {
> > listen 80; # http
> > server_name www.yoursite.com;
> > [...]
> > location /robots.txt {
> > break;
> > }
> >}
> >server {
> > listen 443; # https
> > server_name www.yoursite.com;
> > [...]
> > location /robots.txt {
> > rewrite (.*) /robots_ssl.txt;
> > }
> >}
>

--
Igor Sysoev
http://sysoev.ru/en/
Dave Cheney
Re: Serving an alternate robots.txt for SSL requests.
January 14, 2009 10:00AM
That is AWESOME Igor.

Cheers

Dave

On 14/01/2009, at 5:57 PM, Igor Sysoev wrote:

> If your servers are different only in this part, then in 0.7 you can
>
> server {
> listen 80;
> listen 443 default ssl;
>
> location = /robots.txt {
> if ($server_port = 443) { # or ($scheme = https)
> rewrite ^ /robots_ssl.txt last; # or "break;"
> }
> }
Juan Fco. Giordana
Re: Serving an alternate robots.txt for SSL requests.
January 14, 2009 01:10PM
Thanks again Igor,

Yes, that was exactly the first approach I tried. A couple of months ago I
was quite impressed when that feature came out, but I needed to downgrade
to 0.6 before going to production :)

I knew it was possible but didn't remember (or know) whether there was
support for this in 0.6.

Thanks a lot, and I hope this clears up the same doubts for others.

Igor Sysoev wrote:
> If your servers are different only in this part, then in 0.7 you can
>
> server {
> listen 80;
> listen 443 default ssl;
>
> location = /robots.txt {
> if ($server_port = 443) { # or ($scheme = https)
> rewrite ^ /robots_ssl.txt last; # or "break;"
> }
> }
>