HTTPS surface reachable (robots ✓, sitemap ✓, title ✗)
Why it matters: Public files such as robots.txt, sitemap.xml, and head meta tags are among the first things an attacker reads during reconnaissance. Disallow rules that name sensitive paths, stale sitemaps, and verbose generator tags can disclose more than intended (ISO 27001 A.8.9).
robots.txt
present
# Production Robots.txt file
User-agent: *
#Special parameters
Disallow: /etc.clientlibs/settings/wcm/designs/telegraph/core/clientlibs/save-article.
Disallow: /etc.clientlibs/settings/wcm/designs/telegraph/core/clientlibs/page-refresh.
Disallow: /*?mobile=true
Disallow: /*?mobile=basic
Disallow: /*?ModPagespeed=noscript
Disallow: /*_jcr_content*
Disallow: /*?source=rss
Disallow: /puzzles/puzzle/*?source=
# Internal Search
Disallow: /search/
Allow: /search/$
# Special areas
Disallow: /news/main.jhtml
Disallow: /p/*/embed/
Disallow: /secure/login/*
Disallow: /content/telegraph/
Disallow: /customer/secure/checkout/tesco/
Disallow: /customer/secure/reset-password/
Disallow: /telegraph/*
Disallow: /news-app/*
Disallow: /amp$
Disallow: */application/*
Disallow: */ixale/
Disallow: /core/Content/
Disallow: /promotions/emails/
Disallow: /r/
Disallow: /sponsored/travel/msc-cruises/
Disallow: /travel/8711559/The-Telegraph-Travel-Awards-2011.html
Disallow: /travel/hotel/e/*
Disallow: /sponsored/staging/
Disallow: /sponsored/business/lloyds-tsb-enterprise-awards/
Disallow: /sponsored/earth/statoil/
Disallow: /sponsored/motoring/alfa-romeo-cars/
Disallow: /sponsored/motoring/vw-up/
Disallow: /sponsored/property/all-saints-eastbourne/
Disallow: /sponsored/supplement-portfolio/
Disallow: /sponsored/travel/cunard-cruises/
Disallow: /sponsored/travel/cruise-holidays/
Disallow: /sponsored/travel/macau/macaumap/
Disallow: /sponsored/travel/telegraph-cottages/
Disallow: /sponsored/finance/spread-betting/
Disallow: /sponsored/finance/retirement-annuity/
Disallow: /sponsored/travel/hidden-britain/
Disallow: /sponsored/business/sme-business-essentials/
Disallow: /sponsored/in-the-know/london-cultural-attractions
Disallow: /sponsored/in-the-know/london-dining
Disallow: /sponsored/in-the-know/london-entertainment
Disallow: /sponsored/in-the-know/london-lifestyle
Disallow: /sponsored/in-the-know/london-nightlife
Disallow: /sponsored/in-the-know/london-shopping
Disallow: /sponsored/in-the-know/london-sport-activities
Disallow: /sponsored/in-the-know/london-transport-accommodation
Disallow: /sponsored/in-the-know/london-video-guides
Disallow: /sponsored/motoring/suzuki-motorbikes/
Disallow: /sponsored/technology/cool-list/
Disallow: /travel/hotels/hotel-finder/
Disallow: /podcasts-more/
Disallow: /secure/register/
Allow: /travel/hotels/hotel-finder/$
Disallow: /martech/js/
Disallow: /martech/css/
Disallow: /martech-content/
Disallow: /bin/telegraph/recombee-config
Disallow: /*&p=
Disallow: /customer/subscription/*?
#Bots which make unnecessary bot traffic
User-Agent: endeca
Disallow: /archive/
Disallow: /search/*
User-agent: AI2Bot
Disallow: /
User-agent: Ai2Bot-Dolma
Disallow: /
User-agent: Amazonbot
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: Applebot-Extended
Disallow: /
User-agent: bedrockbot
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: ChatGLM-Spider
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: Claude-SearchBot
Disallow: /
User-agent: Claude-User
Disallow: /
User-agent: Claude-Web
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: cohere-ai
Disallow: /
User-agent: Cotoyogi
Disallow: /
User-agent: DeepSeekBot
Disallow: /
User-agent: Diffbot
Disallow: /
User-agent: DuckAssistBot
Disallow: /
User-agent: FacebookBot
Disallow: /
User-agent: FriendlyCrawler
Disallow: /
User-agent: Google-CloudVertexBot
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: GoogleOther
Disallow: /
User-agent: GoogleOther-Image
Disallow: /
User-agent: GoogleOther-Video
Disallow: /
User-agent: GPTBot
Disallow: /
User-agent: Grok
Disallow: /
User-agent: iaskspider/2.0
Disallow: /
User-agent: ICC-Crawler
Disallow: /
User-agent: ImagesiftBot
Disallow: /
User-agent: img2dataset
Disallow: /
User-agent: ISSCyberRiskCrawler
Disallow: /
User-agent: Kangaroo Bot
Disallow: /
User-agent: KunatoCrawler
Disallow: /
User-agent: Meltwater
Disallow: /
User-agent: Meta-ExternalAgent
Disallow: /
User-agent: Meta-ExternalFetcher
Disallow: /
User-agent:
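The listing above can be reviewed programmatically rather than by eye. The following is a minimal sketch, not the scanner's own logic: it groups Disallow paths by user agent and flags those that contain a keyword suggesting a sensitive area. The function names and the SENSITIVE_HINTS list are assumptions made for illustration. A trailing empty "User-agent:" line, as in the cut-off capture above, is skipped.

```python
from collections import defaultdict

# Hypothetical keyword list (an assumption); tune it for the site under review.
SENSITIVE_HINTS = ("secure", "login", "staging", "checkout", "reset-password", "/bin/")


def disallowed_paths(robots_txt: str) -> dict[str, list[str]]:
    """Group Disallow paths by the User-agent lines that precede them."""
    groups: dict[str, list[str]] = defaultdict(list)
    agents: list[str] = []
    in_rules = False  # True once the current group has seen a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                agents, in_rules = [], False
            if value:  # ignore an empty or truncated User-agent line
                agents.append(value)
        elif field in ("disallow", "allow"):
            in_rules = True
            if field == "disallow" and value:
                for agent in agents:
                    groups[agent].append(value)
    return dict(groups)


def sensitive_looking(paths: list[str]) -> list[str]:
    """Return the paths that contain any of the hint keywords."""
    return [p for p in paths if any(h in p.lower() for h in SENSITIVE_HINTS)]
```

Run against the rules for `User-agent: *`, this would surface entries such as /secure/login/*, /customer/secure/checkout/tesco/ and /sponsored/staging/ for manual review. A keyword match is only a prompt to look closer, not evidence of exposure.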
sitemap.xml
present — 0 url(s)
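A count of 0 URLs does not necessarily mean the sitemap is empty. One possible explanation, which this scan output does not confirm, is that the file is a sitemap index listing child sitemaps rather than pages. The sketch below counts both kinds of entry using the standard sitemap protocol namespace; the function name is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def count_sitemap_entries(xml_text: str) -> dict[str, int]:
    """Count direct <url> and <sitemap> children of the document root."""
    root = ET.fromstring(xml_text)
    return {
        "urls": len(root.findall("sm:url", NS)),
        "child_sitemaps": len(root.findall("sm:sitemap", NS)),
    }
```

If "child_sitemaps" is non-zero while "urls" is 0, the child files would need to be fetched and counted before judging the sitemap stale or empty.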
social
no OpenGraph or Twitter meta tags found
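To re-check this finding by hand, a short sketch using only the Python standard library is shown below. It collects meta tags whose property or name starts with "og:" or "twitter:". The class and function names are assumptions made for illustration, and the page HTML is expected to be fetched separately.

```python
from html.parser import HTMLParser


class SocialMetaParser(HTMLParser):
    """Collect OpenGraph and Twitter card meta tags from an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # OpenGraph normally uses property=, Twitter cards normally use name=.
        key = (attr.get("property") or attr.get("name") or "").lower()
        if key.startswith(("og:", "twitter:")):
            self.tags[key] = attr.get("content") or ""


def social_meta(html: str) -> dict[str, str]:
    """Return a mapping of social meta tag names to their content values."""
    parser = SocialMetaParser()
    parser.feed(html)
    return parser.tags
```

An empty result for the served HTML would agree with the finding above. Tags injected later by client-side JavaScript would not be seen by this kind of static parse.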