HTTPS surface reachable (robots ✓, sitemap ✗, title ✓)
Why it matters: Public files (robots.txt, sitemap.xml, and head metadata) are among the first things an attacker reads during reconnaissance. Paths advertised in robots.txt, stale sitemaps, and verbose generator tags can leak more than intended (ISO 27001 A.8.9).
robots.txt
present
#
# robots.txt
#
# This file is to prevent the crawling and indexing of certain parts
# of your site by web crawlers and spiders run by sites like Yahoo!
# and Google. By telling these "robots" where not to go on your site,
# you save bandwidth and server resources.
#
# This file will be ignored unless it is at the root of your host:
# Used: http://example.com/robots.txt
# Ignored: http://example.com/site/robots.txt
#
# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/robotstxt.html
User-agent: Twitterbot
Allow: /sites/default/files/*.jpg
Allow: /sites/default/files/*.png
Allow: /*.jpg
Allow: /*.jpeg
Allow: /*.png
Allow: /*.webp
Allow: /*.svg
User-agent: *
Crawl-delay: 10
# Allow important assets (for proper rendering)
Allow: /core/*.css
Allow: /core/*.js
Allow: /core/*.gif
Allow: /core/*.png
Allow: /core/*.jpg
Allow: /core/*.jpeg
Allow: /core/*.svg
Allow: /themes/*.css
Allow: /themes/*.js
Allow: /themes/*.gif
Allow: /themes/*.png
Allow: /themes/*.jpg
Allow: /themes/*.jpeg
Allow: /themes/*.webp
Allow: /themes/*.svg
# Images
Allow: /misc/*.gif
Allow: /misc/*.jpg
Allow: /misc/*.jpeg
Allow: /misc/*.png
Allow: /modules/*.gif
Allow: /modules/*.jpg
Allow: /modules/*.jpeg
Allow: /modules/*.png
Allow: /profiles/*.gif
Allow: /profiles/*.jpg
Allow: /profiles/*.jpeg
Allow: /profiles/*.png
Allow: /profiles/*.svg
# Block sensitive/system directories
Disallow: /core/
Disallow: /modules/
Disallow: /profiles/
Disallow: /scripts/
# Block admin & user actions
Disallow: /admin/
Disallow: /user/
Disallow: /node/add/
Disallow: /comment/reply/
# IMPORTANT: Do NOT block /node/ (removed from your original)
# Block unnecessary endpoints
Disallow: /api/
Disallow: /service/
Disallow: /jsonapi/
Disallow: /media/oembed
Disallow: /*/media/oembed
# Block private/system files
Disallow: /cron.php
Disallow: /install.php
Disallow: /update.php
Disallow: /upgrade.php
Disallow: /xmlrpc.php
Disallow: /web.config
# Block documentation files
Disallow: /README.md
Disallow: /LICENSE.txt
Disallow: /CHANGELOG.txt
Disallow: /INSTALL.txt
Disallow: /UPGRADE.txt
Disallow: /MAINTAINERS.txt
# Optional: legacy query-based URLs (safe to keep minimal)
Disallow: /?q=admin/
Disallow: /?q=user/
Disallow: /?q=search/
# Sitemap (update with your actual URL)
#Sitemap: https://www.mygov.in/sitemap.xml
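The sitemap ✗ result is consistent with the Sitemap line in the captured file being commented out, and the Disallow entries are the paths a reader of this file learns about. The following is a minimal sketch of how such a file could be summarized, not the scanner's actual implementation. The function name summarize_robots and the shortened sample text are hypothetical and used only for illustration.

```python
def summarize_robots(text):
    """Collect Disallow paths and active Sitemap directives from robots.txt text."""
    disallowed, sitemaps = [], []
    for raw in text.splitlines():
        # Drop comments first, so a commented-out "#Sitemap:" line is ignored.
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if field == "disallow" and value:
            disallowed.append(value)
        elif field == "sitemap" and value:
            sitemaps.append(value)
    return disallowed, sitemaps


# Shortened, hypothetical excerpt modeled on the captured file.
sample = """\
User-agent: *
Disallow: /admin/
Disallow: /update.php
#Sitemap: https://www.mygov.in/sitemap.xml
"""

disallowed, sitemaps = summarize_robots(sample)
print("advertised paths:", disallowed)
print("active sitemaps:", sitemaps)
```

An empty sitemap list here only means that no active Sitemap directive is declared in robots.txt. Whether a sitemap exists at a default location would need a separate request.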
head
- title: MyGov.in | MyGov: A Platform for Citizen Engagement towards Good Governance in India
- description: MyGov is an innovative platform that builds a partnership between citizens and the government through technology for the growth.
social
- og:site_name: MyGov
- og:title: Homepage | MyGov : The citizen engagement platform of Government of India
- og:description: MyGov is an innovative platform that builds a partnership between citizens and the government through technology for the growth.
- og:url: https://www.mygov.in/
- og:type: website
- og:image: https://www.mygov.in/themes/custom/mygov_radix/src/assets/images/logo.webp
- twitter:card: summary_large_image
- twitter:title: Homepage | MyGov : The citizen engagement platform of Government of India
- twitter:description: MyGov is an innovative platform that builds a partnership between citizens and the government through technology for the growth.
- twitter:image: https://www.mygov.in/themes/custom/mygov_radix/src/assets/images/logo.webp
- twitter:site: @mygovindia
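Head and social fields like those listed above can be read directly from a page's HTML. The sketch below shows one possible way to collect them with Python's standard html.parser module and to flag a generator tag, which is the kind of verbose metadata the rationale warns about. The class name HeadMetaParser and the sample HTML (including the ExampleCMS generator value) are hypothetical and are not taken from the scanned site.

```python
from html.parser import HTMLParser


class HeadMetaParser(HTMLParser):
    """Collect <title> text and <meta name/property> content values."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Open Graph tags use "property"; most others use "name".
            key = attrs.get("property") or attrs.get("name")
            if key and attrs.get("content") is not None:
                self.meta[key] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.meta["title"] = self.meta.get("title", "") + data


# Hypothetical page used only to exercise the parser.
sample_html = """<html><head>
<title>Example title</title>
<meta name="description" content="Example description">
<meta property="og:type" content="website">
<meta name="generator" content="ExampleCMS 1.0">
</head><body></body></html>"""

parser = HeadMetaParser()
parser.feed(sample_html)

# A generator tag often reveals the CMS and its version.
leaky = [key for key in parser.meta if key.lower() == "generator"]
print(parser.meta)
print("generator tags found:", leaky)
```

No generator field appears in the captured head data above, so a check like this would report nothing for that field on the scanned page as captured.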