SEO + i18n: canonical URLs on a multilingual Storefront

Per-locale URLs, locale-specific canonical, sitemap-per-locale, plus the 12,000-URL Google Search Console lesson on pagination.

The Netorigo Storefront typically serves three locales: Hungarian, English, and Chinese. Search-engine indexing here is not trivial: HU and EN go quickly, while ZH is a different calculation entirely. Below are the canonical URL strategy, the hreflang configuration, and the Google Search Console lesson that taught us about pagination canonicals.

Per-locale URL shapes

The URL structures do not share a path. The Hungarian product page is /termekek/asus-zenbook-14, the English /products/asus-zenbook-14, the Chinese /cn/products/asus-zenbook-14. Deliberate: Google treats slug localisation as a ranking signal (Hungarian "feltöltő" is not the same query as English "charger").

The Next.js dynamic route is /[locale]/[section]/[slug], where section translates from a per-locale dictionary ({ hu: 'termekek', en: 'products', zh: 'products' }). getStaticPaths produces the correct path directly — no runtime rewrite layer.
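
As a rough illustration of the dictionary lookup described above, here is a minimal sketch that expands a list of slugs into one entry per locale, in the `{ params }` shape that getStaticPaths returns. The names `SECTION`, `PathEntry`, and `buildStaticPaths` are hypothetical, and the sketch does not cover how the `locale` parameter maps onto the public URL shapes listed earlier.

```typescript
type Locale = "hu" | "en" | "zh";

// Section dictionary as quoted in the article.
const SECTION: Record<Locale, string> = {
  hu: "termekek",
  en: "products",
  zh: "products",
};

// Hypothetical entry type mirroring the { params } objects getStaticPaths returns.
interface PathEntry {
  params: { locale: Locale; section: string; slug: string };
}

// Expand every slug into one entry per locale, resolving the section
// segment from the dictionary at build time (no runtime rewrite).
function buildStaticPaths(
  slugs: string[],
  locales: Locale[] = ["hu", "en", "zh"],
): PathEntry[] {
  const paths: PathEntry[] = [];
  for (const slug of slugs) {
    for (const locale of locales) {
      paths.push({ params: { locale, section: SECTION[locale], slug } });
    }
  }
  return paths;
}
```

Inside a page module this would typically be used as `return { paths: buildStaticPaths(allSlugs), fallback: false }`, where `allSlugs` is whatever the catalogue query yields.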

hreflang link tags

Every HTML response's head carries the full locale table:

<link rel="alternate" hreflang="hu" href="https://example.com/termekek/asus-zenbook-14" />
<link rel="alternate" hreflang="en" href="https://example.com/products/asus-zenbook-14" />
<link rel="alternate" hreflang="zh" href="https://example.com/cn/products/asus-zenbook-14" />
<link rel="alternate" hreflang="x-default" href="https://example.com/products/asus-zenbook-14" />

x-default is EN: Google falls back to it when none of the listed languages matches the visitor. Some Hungarian partners want HU as the default, but Google's guidance is that x-default should mark a world-default, not a region-default.
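
The locale table above can be generated rather than hand-written. The following is a minimal sketch, assuming slugs are already URL-safe; `ORIGIN`, `PREFIX`, `localizedUrl`, and `hreflangLinks` are illustrative names, not the Storefront's actual helpers.

```typescript
type Locale = "hu" | "en" | "zh";

const ORIGIN = "https://example.com";

// Path prefix per locale, matching the URL shapes shown in the article.
const PREFIX: Record<Locale, string> = {
  hu: "/termekek",
  en: "/products",
  zh: "/cn/products",
};

// The article uses EN as the world-default.
const X_DEFAULT: Locale = "en";

// Assumes the slug is already URL-safe (no escaping is done here).
function localizedUrl(locale: Locale, slug: string): string {
  return `${ORIGIN}${PREFIX[locale]}/${slug}`;
}

// Emit one alternate link per locale, then the x-default entry.
function hreflangLinks(slug: string): string[] {
  const locales: Locale[] = ["hu", "en", "zh"];
  const links = locales.map(
    (l) => `<link rel="alternate" hreflang="${l}" href="${localizedUrl(l, slug)}" />`,
  );
  links.push(
    `<link rel="alternate" hreflang="x-default" href="${localizedUrl(X_DEFAULT, slug)}" />`,
  );
  return links;
}
```

Every locale's page emits the same four lines, so the cluster is symmetric: each alternate links back to all the others.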

The canonical rule

The <link rel="canonical"> is ALWAYS LOCALE-SPECIFIC. The HU page's canonical is the HU URL, not the EN URL. Plenty of SEO guides find this controversial — many recommend that secondary-locale pages point their canonical at the primary (EN) URL to consolidate link equity.

We DO NOT do that. The arguments:

  1. The Hungarian product page has Hungarian content, for a Hungarian visitor, for a Hungarian conversion. Why would we ask Google to index the English URL instead?
  2. hreflang already says these are equivalent translations. Overriding canonical is redundant and occasionally harmful.
  3. Google's own "International and multilingual sites" documentation recommends exactly this: locale-specific canonical, hreflang-clustered.
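
The rule is small enough to express as a function plus a guard. This is a sketch under the URL shapes shown earlier; `productUrl`, `canonicalLink`, and `isLocaleSpecific` are hypothetical names introduced for illustration.

```typescript
type Locale = "hu" | "en" | "zh";

const ORIGIN = "https://example.com";
const PREFIX: Record<Locale, string> = {
  hu: "/termekek",
  en: "/products",
  zh: "/cn/products",
};

function productUrl(locale: Locale, slug: string): string {
  return `${ORIGIN}${PREFIX[locale]}/${slug}`;
}

// The canonical always points at the URL of the locale being rendered,
// never at the primary (EN) URL.
function canonicalLink(renderedLocale: Locale, slug: string): string {
  return `<link rel="canonical" href="${productUrl(renderedLocale, slug)}" />`;
}

// Guard suitable for a unit test: a canonical that points at another
// locale's URL violates the rule.
function isLocaleSpecific(
  renderedLocale: Locale,
  canonicalHref: string,
  slug: string,
): boolean {
  return canonicalHref === productUrl(renderedLocale, slug);
}
```

A check like `isLocaleSpecific` could run against rendered pages in CI so that a template change cannot silently point HU pages at the EN URL.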

Sitemap per locale

Three XML sitemaps: /sitemap-hu.xml, /sitemap-en.xml, /sitemap-zh.xml. A /sitemap.xml index ties them together. Google can discover the full URL set without a single 80,000-URL combined sitemap, which would exceed Google's limit of 50,000 URLs per sitemap file.
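
A minimal sketch of the index file and the limit arithmetic follows. The function names are hypothetical; the XML namespace is the standard one from the sitemaps.org protocol.

```typescript
// Limit cited in the article: 50,000 URLs per sitemap file.
const MAX_URLS_PER_SITEMAP = 50000;

// Build the /sitemap.xml index that points at one sitemap per locale.
function buildSitemapIndex(origin: string, locales: string[]): string {
  const entries = locales
    .map((l) => `  <sitemap><loc>${origin}/sitemap-${l}.xml</loc></sitemap>`)
    .join("\n");
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    entries,
    "</sitemapindex>",
  ].join("\n");
}

// How many files a given URL count needs under the limit.
function sitemapFilesNeeded(urlCount: number): number {
  return Math.max(1, Math.ceil(urlCount / MAX_URLS_PER_SITEMAP));
}
```

With these numbers, `sitemapFilesNeeded(80000)` is 2, which is why a single combined file is not an option; splitting by locale keeps each file under the limit as long as no single locale exceeds 50,000 URLs.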

The Google Search Console lesson

May 2025: a Search Console report flagged "Duplicate, Google chose different canonical" on ~12,000 URLs. The cause: paginated product listing pages (PLPs, e.g., /products?page=2) used a self-canonical (?page=2), not the base (/products). That is NOT a violation of the canonical rule, but Google's own heuristic often elevates ?page=1 as the true canonical, so pages ?page=2..N effectively lose their indexed identity to the base.

Two options:

  1. All self-canonical: what we did, but it cost us roughly 8,000 of the 12,000 flagged URLs, which lost their indexed identity to the base.
  2. All base canonical: every ?page=N canonicalises to ?page=1. Some teams do this, but then the products on page N can disappear from search.

We picked option 1 over option 2 and accepted that loss, because category-level searches target the base PLP anyway; the per-page indexed surfaces are largely redundant.
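
The two options differ only in what happens to the page parameter. Here is a sketch using the standard URL API; the `Strategy` type and `paginationCanonical` are hypothetical names, and the "base" branch assumes that the URL without a page parameter is the page-1 URL.

```typescript
// "self": each page canonicalises to itself (option 1).
// "base": every page canonicalises to the listing without ?page (option 2).
type Strategy = "self" | "base";

function paginationCanonical(pageUrl: string, strategy: Strategy): string {
  const url = new URL(pageUrl);
  if (strategy === "base") {
    // Assumption: dropping the parameter yields the page-1 URL.
    url.searchParams.delete("page");
  }
  // Other query parameters are left untouched in this sketch.
  return url.toString();
}

const selfCanonical = paginationCanonical("https://example.com/products?page=2", "self");
const baseCanonical = paginationCanonical("https://example.com/products?page=2", "base");
```

Here `selfCanonical` keeps `?page=2` while `baseCanonical` is the bare `/products` URL. A production version would likely also strip tracking or sort parameters before emitting the canonical, which this sketch deliberately leaves out.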

The ZH case and what we deferred

The Chinese locale is a different calculation entirely: currency (CNY), address format (with provinces), payment methods (Alipay, WeChat Pay; neither is Stripe-native). We scoped it in late 2025 and deferred the full build. ZH PLPs are indexed (organic SEO is fine), but transactions aren't enabled: the cart shows a "coming soon" label on the buy button because the partner SealedSecrets weren't worth procuring without proof of arbitrage.

The lesson: the SEO surface lives independently of the physical locale stack, meaning search ranking doesn't presume a full commerce stack underneath. Only the UX gave it away, because the buy button degraded to a placeholder label.