SEO & Discoverability
Built-in tools for search engine optimization
Core Framework Feature
SEO support is built into Harpy.js core. Every new project includes robots.txt and sitemap.xml generation out of the box, making your application discoverable by search engines from day one.
Harpy.js provides a powerful SEO module that automatically generates robots.txt and sitemap.xml files for your application. Built on top of NestJS's dependency injection system, the SEO module offers both default implementations and extensible services for advanced use cases.
Quick Start
For new projects created with the Harpy.js CLI, SEO is automatically configured. For existing projects, add the SeoModule to your app module:
```typescript
import { Module } from '@nestjs/common';
import { SeoModule } from '@harpy-js/core';

@Module({
  imports: [
    SeoModule.forRoot({
      baseUrl: process.env.BASE_URL || 'http://localhost:3000',
    }),
  ],
})
export class AppModule {}
```

⚡ Instant Results: Once imported, your application automatically serves /robots.txt and /sitemap.xml endpoints with sensible defaults.
What Gets Generated
robots.txt
The robots.txt file tells search engine crawlers which parts of your site they can access. Here's what the default configuration generates:
```txt
User-agent: *
Allow: /
Disallow: /api/
Disallow: /private/

Sitemap: https://example.com/sitemap.xml

Host: https://example.com
```

sitemap.xml
The sitemap provides search engines with a structured list of your pages, including metadata about update frequency and priority:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com</loc>
    <lastmod>2025-12-13T12:00:00.000Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>1</priority>
  </url>
  <url>
    <loc>https://example.com/about</loc>
    <lastmod>2025-12-13T12:00:00.000Z</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
```

Key Benefits:
- Faster indexing of new and updated pages
- Better crawl budget allocation
- Improved discovery of deep or dynamic pages
- Priority hints for important content
Custom SEO Service
For most applications, you'll want to customize which URLs appear in your sitemap. Create a custom service by extending BaseSeoService:
Step 1: Create Your SEO Service
```typescript
import { Injectable } from '@nestjs/common';
import { BaseSeoService, SitemapUrl, RobotsConfig } from '@harpy-js/core';

@Injectable()
export class SeoService extends BaseSeoService {
  getSitemapUrls(): Promise<SitemapUrl[]> {
    const now = new Date();

    return Promise.resolve([
      {
        url: this.baseUrl,
        lastModified: now,
        changeFrequency: 'daily',
        priority: 1.0,
      },
      {
        url: `${this.baseUrl}/about`,
        lastModified: now,
        changeFrequency: 'monthly',
        priority: 0.8,
      },
    ]);
  }

  getRobotsConfig(): RobotsConfig {
    return {
      rules: {
        userAgent: '*',
        allow: '/',
        disallow: ['/api/', '/admin/'],
      },
      sitemap: `${this.baseUrl}/sitemap.xml`,
      host: this.baseUrl,
    };
  }
}
```

Step 2: Register Your Custom Service
```typescript
import { Module } from '@nestjs/common';
import { SeoModule } from '@harpy-js/core';
import { SeoService } from './seo.service';

@Module({
  imports: [
    SeoModule.forRootWithService(SeoService, {
      baseUrl: process.env.BASE_URL || 'http://localhost:3000',
    }),
  ],
})
export class AppModule {}
```

💡 Protected Access: The baseUrl property is protected in BaseSeoService, giving your custom service access to the configured base URL without manually passing it around.
Dynamic Sitemaps from Database
Real-world applications often need to generate sitemaps from database content like blog posts, products, or user profiles. Here's how to integrate with TypeORM:
```typescript
import { Injectable } from '@nestjs/common';
import { BaseSeoService, SitemapUrl } from '@harpy-js/core';
import { InjectRepository } from '@nestjs/typeorm';
import { Repository } from 'typeorm';
import { Post } from './entities/post.entity';

@Injectable()
export class SeoService extends BaseSeoService {
  constructor(
    @InjectRepository(Post)
    private postsRepository: Repository<Post>,
  ) {
    super();
  }

  async getSitemapUrls(): Promise<SitemapUrl[]> {
    const staticPages: SitemapUrl[] = [
      {
        url: this.baseUrl,
        lastModified: new Date(),
        changeFrequency: 'daily',
        priority: 1.0,
      },
    ];

    // Fetch blog posts from the database
    const posts = await this.postsRepository.find({
      select: ['slug', 'updatedAt'],
      where: { published: true },
    });

    const dynamicPages = posts.map((post) => ({
      url: `${this.baseUrl}/blog/${post.slug}`,
      lastModified: post.updatedAt,
      changeFrequency: 'weekly' as const,
      priority: 0.7,
    }));

    return [...staticPages, ...dynamicPages];
  }

  getRobotsConfig() {
    return {
      rules: { userAgent: '*', allow: '/' },
      sitemap: `${this.baseUrl}/sitemap.xml`,
    };
  }
}
```

⚠️ Performance Tip: For large databases, consider caching sitemap results or implementing pagination with sitemap index files to avoid overwhelming your database with every crawler request.
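A minimal sketch of that caching idea, assuming a simple time-based TTL; CachedSeoService, CACHE_TTL_MS, and the cache fields are illustrative names, not part of the Harpy.js API:

```typescript
import { Injectable } from '@nestjs/common';
import { BaseSeoService, SitemapUrl } from '@harpy-js/core';
import { InjectRepository } from '@nestjs/typeorm';
import { Repository } from 'typeorm';
import { Post } from './entities/post.entity';

// Illustrative TTL: serve cached sitemap URLs for five minutes
// so repeated crawler requests don't each hit the database.
const CACHE_TTL_MS = 5 * 60 * 1000;

@Injectable()
export class CachedSeoService extends BaseSeoService {
  private cachedUrls: SitemapUrl[] | null = null;
  private cachedAt = 0;

  constructor(
    @InjectRepository(Post)
    private postsRepository: Repository<Post>,
  ) {
    super();
  }

  async getSitemapUrls(): Promise<SitemapUrl[]> {
    // Return the cached result while it is still fresh.
    if (this.cachedUrls && Date.now() - this.cachedAt < CACHE_TTL_MS) {
      return this.cachedUrls;
    }

    const posts = await this.postsRepository.find({
      select: ['slug', 'updatedAt'],
      where: { published: true },
    });

    this.cachedUrls = posts.map((post) => ({
      url: `${this.baseUrl}/blog/${post.slug}`,
      lastModified: post.updatedAt,
      changeFrequency: 'weekly' as const,
      priority: 0.7,
    }));
    this.cachedAt = Date.now();

    return this.cachedUrls;
  }

  getRobotsConfig() {
    return {
      rules: { userAgent: '*', allow: '/' },
      sitemap: `${this.baseUrl}/sitemap.xml`,
    };
  }
}
```

For sitemaps that grow beyond the protocol's 50,000-URL limit per file, the sitemap index mechanism mentioned in the tip above is the standard next step.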
Multi-language Sitemaps
For internationalized applications, you can use the alternates property to specify alternate language versions of each page:
```typescript
getSitemapUrls(): Promise<SitemapUrl[]> {
  const locales = ['en', 'fr', 'es'];
  const pages = ['/', '/about', '/contact'];

  const urls: SitemapUrl[] = [];

  for (const locale of locales) {
    for (const page of pages) {
      urls.push({
        url: `${this.baseUrl}/${locale}${page}`,
        lastModified: new Date(),
        changeFrequency: 'weekly',
        priority: page === '/' ? 1.0 : 0.8,
        alternates: {
          languages: locales.reduce((acc, lang) => {
            acc[lang] = `${this.baseUrl}/${lang}${page}`;
            return acc;
          }, {} as Record<string, string>),
        },
      });
    }
  }

  return Promise.resolve(urls);
}
```

This generates proper <xhtml:link> tags in your sitemap, helping search engines understand the relationship between translated versions of your content.
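For reference, the sitemaps.org xhtml extension expresses these alternates as shown below; the exact markup Harpy.js emits may differ slightly, but an entry for /en/about from the loop above would look roughly like this:

```xml
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://example.com/en/about</loc>
    <xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/about"/>
    <xhtml:link rel="alternate" hreflang="fr" href="https://example.com/fr/about"/>
    <xhtml:link rel="alternate" hreflang="es" href="https://example.com/es/about"/>
  </url>
</urlset>
```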
Advanced robots.txt Configuration
You can define different rules for different crawlers and specify multiple sitemaps:
```typescript
getRobotsConfig(): RobotsConfig {
  return {
    rules: [
      {
        userAgent: 'Googlebot',
        allow: '/',
        crawlDelay: 0,
      },
      {
        userAgent: 'Bingbot',
        allow: '/',
        crawlDelay: 1,
      },
      {
        userAgent: '*',
        disallow: ['/api/', '/admin/', '/private/'],
      },
    ],
    sitemap: [
      `${this.baseUrl}/sitemap.xml`,
      `${this.baseUrl}/sitemap-images.xml`,
    ],
    host: this.baseUrl,
  };
}
```

Use Cases:
- Crawl delays: Prevent aggressive crawlers from overwhelming your server
- Multiple sitemaps: Separate sitemaps for different content types (pages, images, videos)
- Bot-specific rules: Customize behavior for Google, Bing, or other specific crawlers
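For reference, a configuration like the one above would render to robots.txt roughly as follows, assuming baseUrl resolves to https://example.com (exact ordering and formatting may vary by module version):

```txt
User-agent: Googlebot
Allow: /
Crawl-delay: 0

User-agent: Bingbot
Allow: /
Crawl-delay: 1

User-agent: *
Disallow: /api/
Disallow: /admin/
Disallow: /private/

Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap-images.xml

Host: https://example.com
```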
TypeScript Types
The SEO module is fully typed for excellent IDE support:
```typescript
interface SitemapUrl {
  url: string;
  lastModified?: Date | string;
  changeFrequency?:
    | 'always'
    | 'hourly'
    | 'daily'
    | 'weekly'
    | 'monthly'
    | 'yearly'
    | 'never';
  priority?: number; // 0.0 to 1.0
  alternates?: {
    languages?: Record<string, string>;
  };
}

interface RobotsConfig {
  rules: {
    userAgent: string | string[];
    allow?: string | string[];
    disallow?: string | string[];
    crawlDelay?: number;
  } | Array<{...}>; // or an array of rule objects with the same shape
  sitemap?: string | string[];
  host?: string;
}
```

Testing Your Configuration
Once your application is running, test your SEO endpoints:
```bash
# Test robots.txt
curl http://localhost:3000/robots.txt

# Test sitemap.xml
curl http://localhost:3000/sitemap.xml
```

🔍 Validation Tools:
- Google's robots.txt Tester
- Google Search Console - Submit and validate sitemaps
- XML Sitemap Validator
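Before reaching for those tools, you can also sanity-check the sitemap locally; this assumes xmllint (shipped with libxml2) is installed:

```bash
# Prints nothing if the sitemap is well-formed XML, errors otherwise
curl -s http://localhost:3000/sitemap.xml | xmllint --noout -
```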
Best Practices
✅ Update Frequencies
Set realistic changeFrequency values: 'daily' for the homepage, 'weekly' for blog posts, and 'monthly' for static pages.
✅ Priority Values
Reserve priority: 1.0 for your most important pages. Use 0.8-0.9 for major sections and 0.5-0.7 for standard content.
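Taken together, these two guidelines might translate into tiers like the following sketch (the pages and exact values are illustrative, not framework defaults):

```typescript
import { SitemapUrl } from '@harpy-js/core';

const baseUrl = process.env.BASE_URL || 'http://localhost:3000';

// Hypothetical page tiers combining the frequency and priority guidance above.
const urls: SitemapUrl[] = [
  { url: baseUrl, changeFrequency: 'daily', priority: 1.0 },                    // homepage
  { url: `${baseUrl}/blog`, changeFrequency: 'weekly', priority: 0.8 },         // major section
  { url: `${baseUrl}/blog/my-post`, changeFrequency: 'weekly', priority: 0.7 }, // standard content
  { url: `${baseUrl}/terms`, changeFrequency: 'monthly', priority: 0.5 },       // static page
];
```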
✅ Caching Strategy
The SEO controllers include cache headers by default (24h for robots.txt, 1h for sitemap.xml). Adjust these based on how frequently your content changes.
✅ Environment Variables
Always use process.env.BASE_URL for your base URL configuration. This ensures correct URLs across development, staging, and production environments.
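In practice that usually means one value per environment, for example (illustrative values):

```bash
# .env.development
BASE_URL=http://localhost:3000

# .env.production
BASE_URL=https://example.com
```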
Why SEO Matters in Modern Web Development
Search engine optimization isn't just about rankings; it's about making your application discoverable, accessible, and successful:
Organic Traffic
Properly configured sitemaps help search engines discover and index your content faster, leading to increased organic traffic.
Lower Acquisition Costs
Good SEO reduces reliance on paid advertising by improving organic visibility and reducing customer acquisition costs.
Credibility & Trust
Users trust search results. Higher rankings signal authority and build credibility with your audience.
Global Reach
Multi-language sitemap support helps international audiences find your content in their preferred language.
Built for Production from Day One
By including SEO as a core framework feature, Harpy.js ensures that developers don't have to remember to add these critical features later. Your application is search-engine ready from the moment you start development, following industry best practices automatically.