How to Block Junk Search-Engine Spiders and Scanning Tools in the BaoTa Panel (宝塔面板)

qiang, 2022-04-02, WordPress B2 Optimization

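Beyond the familiar search engines such as Baidu, Google, Sogou, and 360, many other crawlers exist. They usually bring no traffic, and their heavy crawl volume wastes the host's CPU and bandwidth. Blocking them is simple: follow the steps below. The principle is to match each request's User-Agent (UA) against a list of known names and reject the request on a match. Source: Huang Qiang's blog (黄强博客), https://huangqiang.me/90.html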
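To make the principle concrete, here is a minimal Python sketch of the matching logic. It is not part of the original setup: the function name, the abbreviated pattern, and the sample UA strings are all illustrative assumptions. It imitates an unanchored, case-insensitive regex match (the behaviour of the nginx `~*` operator) and returns 403 on a match.

```python
import re

# Abbreviated, illustrative subset of the names blocked in this article.
BLOCKED = "MJ12bot|AhrefsBot|SemrushBot|DuckDuckGo|crawler"


def status_for_user_agent(user_agent: str, pattern: str = BLOCKED) -> int:
    """Return 403 on an unanchored, case-insensitive match, else 200."""
    if re.search(pattern, user_agent, re.IGNORECASE):
        return 403
    return 200


# Illustrative UA strings (assumed samples, not taken from real logs).
samples = [
    "Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)",
    "mj12BOT/v1.4.8",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Linux; Android 13) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Version/4.0 Chrome/120.0.0.0 Mobile DuckDuckGo/5 Safari/537.36",
]

for ua in samples:
    print(status_for_user_agent(ua), ua)
```

Because the match is a substring search, any UA containing a listed name anywhere is rejected. In this sketch the last sample, a browser-style UA that happens to contain DuckDuckGo, is rejected along with the bots, so it is worth checking a few UAs from your own access log before adopting a long list.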

Blocking Junk Spiders by User-Agent: Method 1

First, log in to the BaoTa panel, open the file manager, and go to the /www/server/nginx/conf directory. Create an empty file named kill_bot.conf and save the following code in it.

# Block junk search-engine spiders.
# Keep the whole pattern on one line: a line break inside the quoted string would become part of the pattern.
if ($http_user_agent ~* "YYSpider|Mattermost|Discord|CCBot|RepoLookoutBot|tracking|serpstatbot|Pinterestbot|SurdotlyBot|DataForSeoBot|DigExt|HttpClient|MJ12bot|heritrix|EasouSpider|Ezooms|FlightDeckReports|Linguee Bot|Web-Crawler|WellKnownBot|Yellowbrandprotectionbot|ev-crawler|NE Crawler|Facebot|GrapeshotCrawler|SemrushBot|DotBot|MegaIndex.ru|MauiBot|AhrefsBot|MJ12bot|BLEXBot|HubSpot Crawler|CriteoBot|Web-Crawler|web-crawlers|DataForSeoBot|YaK|Mail.RU_Bot|Barkrowler|crawler|SEOkicks-Robot|vxiaotou-spider|telegram|dingtalk|Twitterbot|DuckDuckGo|applebot|webprosbot|AwarioBot|Amazonbot|AmazonAdBot|YouBot|ZaldomoSearchBot|SanCheezeBot|MRGbot|Squirrel-spider|t3versionsBot|Internet-structure-research-project-bot|Birdcrawlerbot|webprosbot|Facebot|tracking bot|coccocbot|Cocolyzebot|Amazonbot|RedirectBot|vuhuvBot|domainsbot|NE Crawler|CCBot|oBot|BLEXBot|Orbbot|Neevabot|DataForSeoBot|VelenPublicWebCrawler|LightspeedSystemsCrawler|PetalBot|CheckMarkNetwork|Synapse|Nimbostratus-Bot|Dark|scraper|LMAO|Hakai|Gemini|Wappalyzer|masscan|crawler4j|Mappy|Center|eright|aiohttp|MauiBot|Crawler|researchscan|Dispatch|AlphaBot|Census|ips-agent|NetcraftSurveyAgent|ToutiaoSpider|EasyHttp|Iframely|sysscan|fasthttp|muhstik|DeuSu|mstshash|HTTP_Request|ExtLinksBot|package|SafeDNSBot|CPython|SiteExplorer|SSH|MegaIndex|BUbiNG|CCBot|NetTrack|Digincore|aiHitBot|SurdotlyBot|null|SemrushBot|Test|Copied|ltx71|Nmap|DotBot|AdsBot|InetURL|Pcore-HTTP|PocketParser|Wotbox|newspaper|DnyzBot|redback|PiplBot|SMTBot|WinHTTP|Auto Spider 1.0|GrabNet|TurnitinBot|Go-Ahead-Got-It|Download Demon|Go!Zilla|GetWeb!|GetRight|libwww-perl|Cliqzbot|MailChimp|SMTBot|Dataprovider|XoviBot|linkdexbot|SeznamBot|Qwantify|spbot|evc-batch|zgrab|Go-http-client|FeedDemon|JikeSpider|Indy Library|Alexa Toolbar|AskTbFXTV|AhrefsBot|CrawlDaddy|CoolpadWebkit|Java|UniversalFeedParser|ApacheBench|Microsoft URL Control|Swiftbot|ZmEu|jaunty|Python-urllib|lightDeckReports Bot|YYSpider|DigExt|HttpClient|MJ12bot|EasouSpider|LinkpadBot|Ezooms") {
    return 403;
}

# Block scanning-tool clients.
if ($http_user_agent ~* "crawl|curb|git|Wtrace|Scrapy") {
    return 403;
}

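After saving, return to BaoTa, open [Website] → [Settings], and click the [Config File] tab on the left. On the blank line above the comment "#SSL-START SSL 相关配置,请勿删除或修改下一行带注释的404规则" (roughly: "SSL-related configuration; do not delete or modify the commented 404 rule on the next line"), insert this line: include kill_bot.conf; Save the file and the rule takes effect. From then on, these spiders and tools receive a 403 Forbidden response when they request the site.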
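One way to confirm the result is to request the site with a blocked User-Agent and look at the status code. The following is a minimal sketch using only the Python standard library; the function name `fetch_status` and the URL `https://example.com/` are placeholders assumed for illustration, not part of the original setup.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server sends for this User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is on the exception.
        return err.code


if __name__ == "__main__":
    site = "https://example.com/"  # placeholder: replace with your own site
    for ua in ("MJ12bot", "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"):
        print(ua, "->", fetch_status(site, ua))
```

If the include is active, the first request should come back as 403 while the browser-style UA is still served normally. If both return the same status, check that the include line was saved in the right site's config file.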

Blocking Junk Spiders by User-Agent: Method 2

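Install the BaoTa Nginx firewall. Open its settings from Software Management, then go to Global Configuration → Black/White Lists → UA blacklist (User-Agent blacklist) → Settings. Enter the rules below, write any description you like, and click Add.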

Junk spider rule list:

www.seokicks.de
YYSpider
Mattermost
Discord
CCBot
RepoLookoutBot
tracking
serpstatbot
Pinterestbot
SurdotlyBot
DataForSeoBot
DigExt
HttpClient
MJ12bot
heritrix
EasouSpider
Ezooms
FlightDeckReports
Linguee Bot
Web-Crawler
WellKnownBot
Yellowbrandprotectionbot
ev-crawler
NE Crawler
Facebot
GrapeshotCrawler
SemrushBot
DotBot
MegaIndex.ru
MauiBot
AhrefsBot
MJ12bot
BLEXBot
HubSpot Crawler
CriteoBot
Web-Crawler
web-crawlers
DataForSeoBot
YaK
Mail.RU_Bot
Barkrowler
crawler
SEOkicks-Robot
vxiaotou-spider
telegram
dingtalk
Twitterbot
DuckDuckGo
applebot
webprosbot
AwarioBot
Amazonbot
AmazonAdBot
YouBot
ZaldomoSearchBot
SanCheezeBot
MRGbot
Squirrel-spider
t3versionsBot
Internet-structure-research-project-bot
Birdcrawlerbot
webprosbot
Facebot
tracking bot
coccocbot
Cocolyzebot
Amazonbot
RedirectBot
vuhuvBot
domainsbot
NE Crawler
CCBot
oBot
BLEXBot
Orbbot
Neevabot
DataForSeoBot
VelenPublicWebCrawler
LightspeedSystemsCrawler
PetalBot
CheckMarkNetwork
Synapse
Nimbostratus-Bot
Dark
scraper
LMAO
Hakai
Gemini
Wappalyzer
masscan
crawler4j
Mappy
Center
eright
aiohttp
MauiBot
Crawler
researchscan
Dispatch
AlphaBot
Census
ips-agent
NetcraftSurveyAgent
ToutiaoSpider
EasyHttp
Iframely
sysscan
fasthttp
muhstik
DeuSu
mstshash
HTTP_Request
ExtLinksBot
package
SafeDNSBot
CPython
SiteExplorer
SSH
MegaIndex
BUbiNG
CCBot
NetTrack
Digincore
aiHitBot
SurdotlyBot
null
SemrushBot
Test
Copied
ltx71
Nmap
DotBot
AdsBot
InetURL
Pcore-HTTP
PocketParser
Wotbox
newspaper
DnyzBot
redback
PiplBot
SMTBot
WinHTTP
Auto Spider 1.0
GrabNet
TurnitinBot
Go-Ahead-Got-It
Download Demon
Go!Zilla
GetWeb!
GetRight
libwww-perl
Cliqzbot
MailChimp
SMTBot
Dataprovider
XoviBot
linkdexbot
SeznamBot
Qwantify
spbot
evc-batch
zgrab
Go-http-client
FeedDemon
JikeSpider
Indy Library
Alexa Toolbar
AskTbFXTV
AhrefsBot
CrawlDaddy
CoolpadWebkit
Java
UniversalFeedParser
ApacheBench
Microsoft URL Control
Swiftbot
ZmEu
jaunty
Python-urllib
lightDeckReports Bot
YYSpider
DigExt
HttpClient
MJ12bot
EasouSpider
LinkpadBot
Ezooms

If you installed the free edition of the BaoTa Linux firewall instead, go to Software Management → Linux Firewall (free edition) → Global Configuration → User-Agent Filter.

Enter the following rule as a single line:

(www.seokicks.de|YYSpider|Mattermost|Discord|CCBot|RepoLookoutBot|tracking|serpstatbot|Pinterestbot|SurdotlyBot|DataForSeoBot|DigExt|HttpClient|MJ12bot|heritrix|EasouSpider|Ezooms|FlightDeckReports|Linguee Bot|Web-Crawler|WellKnownBot|Yellowbrandprotectionbot|ev-crawler|NE Crawler|Facebot|GrapeshotCrawler|SemrushBot|DotBot|MegaIndex.ru|MauiBot|AhrefsBot|MJ12bot|BLEXBot|HubSpot Crawler|CriteoBot|Web-Crawler|web-crawlers|DataForSeoBot|YaK|Mail.RU_Bot|Barkrowler|crawler|SEOkicks-Robot|vxiaotou-spider|telegram|dingtalk|Twitterbot|DuckDuckGo|applebot|webprosbot|AwarioBot|Amazonbot|AmazonAdBot|YouBot|ZaldomoSearchBot|SanCheezeBot|MRGbot|Squirrel-spider|t3versionsBot|Internet-structure-research-project-bot|Birdcrawlerbot|webprosbot|Facebot|tracking bot|coccocbot|Cocolyzebot|Amazonbot|RedirectBot|vuhuvBot|domainsbot|NE Crawler|CCBot|oBot|BLEXBot|Orbbot|Neevabot|DataForSeoBot|VelenPublicWebCrawler|LightspeedSystemsCrawler|PetalBot|CheckMarkNetwork|Synapse|Nimbostratus-Bot|Dark|scraper|LMAO|Hakai|Gemini|Wappalyzer|masscan|crawler4j|Mappy|Center|eright|aiohttp|MauiBot|Crawler|researchscan|Dispatch|AlphaBot|Census|ips-agent|NetcraftSurveyAgent|ToutiaoSpider|EasyHttp|Iframely|sysscan|fasthttp|muhstik|DeuSu|mstshash|HTTP_Request|ExtLinksBot|package|SafeDNSBot|CPython|SiteExplorer|SSH|MegaIndex|BUbiNG|CCBot|NetTrack|Digincore|aiHitBot|SurdotlyBot|null|SemrushBot|Test|Copied|ltx71|Nmap|DotBot|AdsBot|InetURL|Pcore-HTTP|PocketParser|Wotbox|newspaper|DnyzBot|redback|PiplBot|SMTBot|WinHTTP|Auto Spider 1.0|GrabNet|TurnitinBot|Go-Ahead-Got-It|Download Demon|Go!Zilla|GetWeb!|GetRight|libwww-perl|Cliqzbot|MailChimp|SMTBot|Dataprovider|XoviBot|linkdexbot|SeznamBot|Qwantify|spbot|evc-batch|zgrab|Go-http-client|FeedDemon|JikeSpider|Indy Library|Alexa Toolbar|AskTbFXTV|AhrefsBot|CrawlDaddy|CoolpadWebkit|Java|UniversalFeedParser|ApacheBench|Microsoft URL Control|Swiftbot|ZmEu|jaunty|Python-urllib|lightDeckReports Bot|YYSpider|DigExt|HttpClient|MJ12bot|EasouSpider|LinkpadBot|Ezooms)
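The rule list repeats a number of entries (MJ12bot, CCBot, and YYSpider, for example, each appear more than once). If you maintain your own copy, a small helper can drop the repeats and rebuild the single-line pattern. This is a minimal sketch, not the author's tooling: the function name `build_pattern` and its `wrap` parameter are assumptions made for illustration, and the comparison is case-sensitive, so "crawler" and "Crawler" are kept as separate entries.

```python
def build_pattern(entries, wrap=False):
    """Join unique, non-empty entries with "|" in first-seen order."""
    seen = set()
    unique = []
    for entry in entries:
        name = entry.strip()
        if name and name not in seen:
            seen.add(name)
            unique.append(name)
    pattern = "|".join(unique)
    # The firewall rule in this article wraps the alternation in parentheses.
    return f"({pattern})" if wrap else pattern


# Short excerpt of the rule list, including repeats like those in the original.
rules = """
YYSpider
MJ12bot
CCBot
AhrefsBot
MJ12bot
CCBot
YYSpider
""".splitlines()

print(build_pattern(rules))             # YYSpider|MJ12bot|CCBot|AhrefsBot
print(build_pattern(rules, wrap=True))  # (YYSpider|MJ12bot|CCBot|AhrefsBot)
```

The unwrapped form fits between the quotes of the kill_bot.conf rule from Method 1, and the wrapped form matches the shape of the firewall rule above.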
