Como bloquear Bots desconhecidos do Joomla

9

Como posso bloquear bots desconhecidos que estão consumindo muita largura de banda do meu site Joomla 3.3.6? Não quero bloquear bots do Yahoo, Google e MSN, apenas outros. Eu verifiquei minha recente notícia; hoje, cerca de 10.720 + 265 ocorrências são realizadas por bots desconhecidos e consumiram aproximadamente 1 GB de largura de banda.

Estou ansioso por uma solução positiva.

Naeem
fonte

Respostas:

7

Você pode permitir que apenas bots do Yahoo, Google e MSN rastreiem seu site usando a User-agentpropriedade. Apenas deixe em branco o Disallow:rastreador permitido.

Slurp é o bot do Yahoo.

Por exemplo:

User-agent: Googlebot
Disallow: /administrator
Disallow: /bin/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /images/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /layouts/
Disallow: /libraries/
Disallow: /logs/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
User-agent: googlebot-image
 Disallow: /administrator
Disallow: /bin/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /images/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /layouts/
Disallow: /libraries/
Disallow: /logs/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/ 
User-agent: googlebot-mobile
Disallow: /administrator
Disallow: /bin/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /images/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /layouts/
Disallow: /libraries/
Disallow: /logs/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/ 
User-agent: MSNBot
Disallow: /administrator
Disallow: /bin/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /images/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /layouts/
Disallow: /libraries/
Disallow: /logs/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/ 
User-agent: Slurp
Disallow: /administrator
Disallow: /bin/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /images/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /layouts/
Disallow: /libraries/
Disallow: /logs/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/ 
User-agent: yahoo-mmcrawler
Disallow: /administrator
Disallow: /bin/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /images/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /layouts/
Disallow: /libraries/
Disallow: /logs/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
User-agent: psbot
Disallow: /administrator
Disallow: /bin/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /images/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /layouts/
Disallow: /libraries/
Disallow: /logs/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
User-agent: yahoo-blogs/v3.9
Disallow: /administrator
Disallow: /bin/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /images/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /layouts/
Disallow: /libraries/
Disallow: /logs/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/ 
User-agent: *
Disallow: /
Disallow: /administrator
Disallow: /bin/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /images/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /layouts/
Disallow: /libraries/
Disallow: /logs/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
zkanoca
fonte
Muito obrigado, como posso fazer isso via arquivo .htaccess? existe alguma lista atualizada de bots ruins?
Naeem 01/01
Crie um arquivo de texto robots.txt no diretório raiz. O Joomla já tem um. Você pode dar uma olhada.
Zkanoca 01/01
Eu acho que ter um arquivo robots.txt para isso é sempre bom e, em seguida, in.htaccess para esses bots que não escutam para bloqueá-los por ip ou agente de usuário
tristanbailey
4

Você pode bloquear vários bots defeituosos conhecidos com este trecho do Master Htaccess do @ Nikosdion :

########## Begin - Common hacking tools and bandwidth hoggers block
## By SigSiu.net and @nikosdion.
# This line also disables Akeeba Remote Control 2.5 and earlier
SetEnvIf user-agent "Indy Library" stayout=1
# WARNING: Disabling wget will also block the most common method for
# running CRON jobs. Remove if you have issues with CRON jobs.
SetEnvIf user-agent "Wget" stayout=1
# The following rules are for bandwidth-hogging download tools
SetEnvIf user-agent "libwww-perl" stayout=1
SetEnvIf user-agent "Download Demon" stayout=1
SetEnvIf user-agent "GetRight" stayout=1
SetEnvIf user-agent "GetWeb!" stayout=1
SetEnvIf user-agent "Go!Zilla" stayout=1
SetEnvIf user-agent "Go-Ahead-Got-It" stayout=1
SetEnvIf user-agent "GrabNet" stayout=1
SetEnvIf user-agent "TurnitinBot" stayout=1
# This line denies access to all of the above tools
deny from env=stayout
########## End - Common hacking tools and bandwidth hoggers block
Seth Warburton
fonte
1

Você também pode procurar em um dos serviços de verificação de ataque do CDN, como o Incapsula. Isso significa alterar o seu DNS, mas eles têm uma boa tela e suporte ao Joomla, para que eu possa acessar sua página e bloquear ou permitir ataques e bots individuais. Eles parecem ter uma lista atualizada de ataques como o JCE e http://www.incapsula.com/?src=6&subelm=holdingbay.co.uk

tristanbailey
fonte