HTTPS surface reachable (robots ✓, sitemap ✗, title ✓)
Why it matters: Public files such as robots.txt, sitemap.xml, and the head meta tags are the first things an attacker sees during reconnaissance. Disallow rules that advertise sensitive paths, stale sitemaps, and verbose generator tags leak more than intended (ISO 27001 A.8.9).
robots.txt
present
# Cambria robots
User-agent: grapeshot
Disallow: /member
Disallow: /*?*err_code=404
Disallow: /search
Disallow: /search/?*
User-agent: *
Disallow: /*?*page=
Disallow: /member
Disallow: /*?*err_code=404
Disallow: /search
Disallow: /search/?*
Disallow: /mapi/v4/*/user/*
Disallow: /embed
Disallow: /*/webview
Disallow: /api
User-agent: Amazonbot
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: Applebot-Extended
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: Claude-Web
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: cohere-ai
Disallow: /
User-agent: Diffbot
Disallow: /
User-agent: DuckAssistBot
Disallow: /
User-agent: FacebookBot
Disallow: /
User-agent: GPTBot
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: Googlebot
Allow: /
Disallow: /*?*err_code=404
Disallow: /search
Disallow: /search/?*
User-agent: magpie-crawler
Disallow: /
User-agent: meta-externalagent
Disallow: /
User-agent: Meta-ExternalFetcher
Disallow: /
User-agent: OAI-SearchBot
Disallow: /
User-agent: Perplexity-User
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: Timpibot
Disallow: /
User-agent: TurnitinBot
Disallow: /
User-agent: Ai2Bot-Dolma
Disallow: /
User-agent: Claude-SearchBot
Disallow: /
User-agent: Claude-User
Disallow: /
User-agent: omgilibot
Disallow: /
User-agent: TtoYouBot
Disallow: /
User-agent: WebGPTBot
Disallow: /
User-agent: Scope3/2.0 (scope3.com)
Allow: /$
Allow: /*
Disallow: /*?*page=
Disallow: /member
Disallow: /*?*err_code=404
Disallow: /search
Disallow: /search/?*
Disallow: /mapi/v4/*/user/*
Disallow: /embed
Disallow: /api
User-agent: AmazonAdBot
Allow: /$
Allow: /*
Disallow: /*?*page=
Disallow: /member
Disallow: /*?*err_code=404
Disallow: /search
Disallow: /search/?*
Disallow: /mapi/v4/*/user/*
Disallow: /embed
Disallow: /api
# archives
Sitemap: https://www.huffpost.com/static-assets/isolated/huffpostsitemapgeneratorjob-prod-public/us/sitemaps/sitemap-v1.xml
Sitemap: https://www.huffpost.com/static-assets/isolated/huffpostsitemapgeneratorjob-prod-public/us/sitemaps/sitemap-google-news.xml
Sitemap: https://www.huffpost.com/static-assets/isolated/huffpostsitemapgeneratorjob-prod-public/us/sitemaps/sitemap-google-video.xml
Sitemap: https://www.huffpost.com/static-assets/isolated/huffpostsitemapgeneratorjob-prod-public/us/sitemaps/sections.xml
Sitemap: https://www.huffpost.com/static-assets/isolated/huffpostsitemapgeneratorjob-prod-public/us/sitemaps/sitemap-top-sections.xml
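To illustrate how much of this file a third party can read mechanically, the following is a minimal sketch that groups Allow/Disallow rules by user agent and collects the Sitemap lines. The function name `summarize_robots` and the shortened sample input are assumptions made for illustration; they are not part of the scan itself, and the parser ignores wildcard semantics and only lists the raw patterns.

```python
from collections import defaultdict


def summarize_robots(text):
    """Return (rules, sitemaps) from raw robots.txt text.

    rules maps each user agent to {"allow": [...], "disallow": [...]}.
    Hypothetical helper: it lists patterns as written and does not
    evaluate wildcards such as * or $.
    """
    rules = defaultdict(lambda: {"allow": [], "disallow": []})
    sitemaps = []
    agents = []             # user agents of the group being read
    prev_was_agent = False  # consecutive User-agent lines share one group
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "sitemap":
            sitemaps.append(value)  # Sitemap lines are not tied to a group
            continue
        if field == "user-agent":
            if not prev_was_agent:
                agents = []  # a new group starts here
            agents.append(value)
            prev_was_agent = True
            continue
        prev_was_agent = False
        if field in ("allow", "disallow"):
            for agent in agents:
                rules[agent][field].append(value)
    return dict(rules), sitemaps


# Shortened sample based on the file above (illustrative only).
sample = """\
# Cambria robots
User-agent: *
Disallow: /member
Disallow: /api
User-agent: GPTBot
Disallow: /
Sitemap: https://www.huffpost.com/static-assets/isolated/huffpostsitemapgeneratorjob-prod-public/us/sitemaps/sections.xml
"""
rules, sitemaps = summarize_robots(sample)
for agent, group in rules.items():
    print(agent, "disallow:", group["disallow"])
print("sitemaps:", len(sitemaps))
```

Running the same function over the full file would surface entries such as /api, /mapi/v4/*/user/* and /member as paths worth reviewing, since they are published to anyone who requests robots.txt.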
head
- title: HuffPost - Breaking News, Politics, Entertainment & Opinion
- description: Read the latest U.S. and world news, politics, entertainment, lifestyle and opinion pieces from HuffPost’s trusted team of journalists.
social
- og:site_name: HuffPost
- og:type: website
- og:title: HuffPost - Breaking News, U.S. and World News
- og:url: https://www.huffpost.com/
- og:description: Read the latest U.S. and world news, politics, entertainment, lifestyle and opinion pieces from HuffPost’s trusted team of journalists.
- og:image: https://img.huffingtonpost.com/asset/6876d9b316000014e542cf17.jpg
- og:image:url: https://img.huffingtonpost.com/asset/6876d9b316000014e542cf17.jpg
- og:app_id: 46744042133
- twitter:description: Read the latest U.S. and world news, politics, entertainment, lifestyle and opinion pieces from HuffPost’s trusted team of journalists.
- twitter:title: HuffPost - Breaking News, U.S. and World News
- twitter:site: @HuffPost
- twitter:image: https://img.huffingtonpost.com/asset/6876d9b316000014e542cf17.jpg
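The head and social values listed above can be collected from a page with the standard library alone. The sketch below is one possible approach, not the method used by this scan: the class name `HeadMetaParser` and the abbreviated sample HTML are assumptions for illustration.

```python
from html.parser import HTMLParser


class HeadMetaParser(HTMLParser):
    """Collect the <title> text and <meta> name/property pairs (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Open Graph tags use "property"; description and twitter:* use "name".
            key = attrs.get("property") or attrs.get("name")
            if key and attrs.get("content") is not None:
                self.meta[key] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# Abbreviated sample markup (illustrative, not the live page source).
sample_html = """<html><head>
<title>HuffPost - Breaking News</title>
<meta name="description" content="Read the latest news.">
<meta property="og:site_name" content="HuffPost">
<meta name="twitter:site" content="@HuffPost">
</head><body></body></html>"""

parser = HeadMetaParser()
parser.feed(sample_html)
parser.close()
print(parser.title)
for key, value in parser.meta.items():
    print(f"- {key}: {value}")
```

A check for verbose generator tags could then look for a `generator` key in `parser.meta`; none appears in the values reported above.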