Technical SEO for Lawyers | Site Performance and Indexation

Your content can be excellent and your firm still invisible. When Google cannot crawl, render, or index an attorney website, none of your lawyer SEO work gets a chance to pay off. This page stays on one layer and goes deep on it: the technical infrastructure that decides whether search engines and AI answer engines can reach your pages, understand them, and trust them enough to show them.

This is not a content guide and not a link guide. It is the plumbing under everything else, written for law firm sites specifically, because attorney websites fail in patterns that generic technical SEO advice never mentions.

TL;DR for Busy Attorneys

The three checks worth running this week:

Read your robots.txt out loud. A staging Disallow: / that survives a redesign takes the whole site out of Google, and the symptom looks like a penalty.
Test a practice area page with Search Console's live URL tool. If your case-type copy is missing from the rendered HTML, your content is fine and your rendering is the problem.
Validate your schema against the visible page. A mismatched address or phone in the markup can quietly suppress the office it describes, and we have seen that last a full quarter before anyone noticed.

What technical SEO actually covers for a law firm

Technical SEO is the set of site and server conditions that let a search engine discover a page, fetch it, render it the way a browser does, store it in the index, and pull a trustworthy answer out of it. It sits apart from the words on the page and apart from who links to you. You can have the best car accident page in your market and still lose to a weaker firm because their site is technically clean and yours is not.

For an attorney website the technical surface area is larger than most firms expect. A single-location firm with eight practice areas and a few attorney bios is already past 30 indexable URLs. A personal injury firm running location pages across 15 cities, practice area pages for 9 case types, attorney bios, and a blog is past 200 URLs before anyone counts. Every one of those pages depends on the same plumbing. When the plumbing leaks, it does not leak on one page. It leaks everywhere at once.

Google also applies stricter scrutiny to legal content. Pages that affect someone's legal rights, money, or safety get evaluated against a higher quality bar, and the technical signals feed that judgment. A site that loads slowly, serves a security warning, or shows Google a half-empty page is sending a trust signal before a human reads a word.

The rest of this page is the inventory: every technical area that breaks on law firm sites, what the failure looks like, and how you fix it.

Crawlability: can a search engine reach every page

Crawlability is the first gate. If a crawler cannot get to a page, nothing downstream matters. On attorney websites the failures cluster in a handful of predictable places.

robots.txt mistakes that take a firm offline

Your robots.txt file sits at yourfirm.com/robots.txt and tells crawlers where they may and may not go. It is also the single most destructive file on the site, because one wrong line disables everything.

The classic version: a firm rebuilds the site, the new build ships with the staging server's robots.txt, and that file still says:

User-agent: *
Disallow: /

That blocks the entire site. Organic traffic falls off two weeks later, the firm assumes Google applied a penalty, and the real cause is a single line nobody read at launch. A close cousin is a directory-level block that survives a migration, something like Disallow: /practice-areas/ left in from a staging exclusion, which silently removes every case-type page from search while the homepage keeps ranking and hides the problem.

A correct baseline for a law firm site looks closer to this:

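User-agent: *
Allow: /
Disallow: /wp-admin/
Disallow: /thank-you/
Disallow: /*fbclid=
Sitemap: https://www.yourfirm.com/sitemap.xml

You allow the site, block the genuinely useless paths (admin, conversion thank-you pages, tracking-parameter URLs), and point crawlers at the sitemap. The tracking rule is written as /*fbclid= rather than /*?fbclid= so it still matches when fbclid is not the first parameter in the query string. Keep in mind that Disallow stops crawling, not indexing: a blocked URL can still be indexed from links alone, and a crawler cannot read a noindex tag on a page it is not allowed to fetch, so pick one mechanism per URL on purpose.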

JurisPage Tip

The single most expensive mistake we see after a firm changes site builders or hosts is a staging Disallow: / that nobody removed on launch day. Before you touch anything else, open the live robots.txt in a browser and read it. If it disallows the root or your practice area folder, that is the entire problem, and fixing it recovers traffic faster than any content project ever will.
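If you want that read-through to happen on every deploy rather than from memory, the check can be scripted. The following is a minimal sketch, not a full robots.txt parser: the function name findDangerousDisallows and the default /practice-areas/ path are assumptions for illustration, and the parser ignores Allow overrides and wildcard patterns, so a hit means "open the file and read it", not "the site is definitely blocked".

```javascript
// Flag Disallow rules in the wildcard (User-agent: *) group that block the
// root or a folder the firm depends on. Simplified on purpose: Allow
// overrides and wildcard patterns are not evaluated.
function findDangerousDisallows(robotsTxt, criticalPaths = ['/practice-areas/']) {
  const findings = [];
  let groupAgents = [];      // user-agents named by the group being read
  let readingRules = false;  // true once the current group has seen a rule

  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, '').trim(); // drop comments
    const sep = line.indexOf(':');
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();

    if (field === 'user-agent') {
      // A user-agent line after rules starts a new group.
      if (readingRules) { groupAgents = []; readingRules = false; }
      groupAgents.push(value.toLowerCase());
    } else if (field === 'disallow' || field === 'allow') {
      readingRules = true;
      if (field !== 'disallow' || !groupAgents.includes('*')) continue;
      if (value === '/' || value === '/*') {
        findings.push('blocks entire site: ' + value);
      } else if (criticalPaths.some((p) => value !== '' && p.startsWith(value))) {
        findings.push('blocks critical path: ' + value);
      }
    }
  }
  return findings;
}
```

Feed it the text of the live file (for example, the body returned by fetching yourfirm.com/robots.txt) and fail the deploy if the returned list is not empty.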

Crawl budget on multi-location firms

Crawl budget is how many URLs a crawler will fetch on your site in a given window. For a small firm under a few hundred pages this rarely matters. For a firm generating location and practice-area combinations it matters a lot, because the site can manufacture thousands of low-value URLs that burn the budget before the pages that earn cases get refreshed.

The usual sources of waste on attorney sites are calendar archives, tag and author archives from a blog, internal search result pages that get indexed, and parameter URLs from filters or tracking. Every one of those is a URL a crawler spends a fetch on instead of spending it on your motorcycle accident page.

You control this three ways. You keep low-value templates out of the index with a noindex directive and out of crawl paths by not linking to them. You consolidate parameter URLs to a single canonical form. And you keep the internal link graph tight so the pages that matter are reachable in a few clicks from the homepage while the junk is not reachable at all.
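Consolidating parameter URLs to one canonical form is mostly string hygiene. Here is a small sketch using the standard URL API. The function name canonicalizeUrl and the list of tracking parameters are assumptions for illustration; adjust the list to what your own analytics and ad platforms append, and leave functional parameters such as pagination alone.

```javascript
// Parameters that only carry tracking data (assumed list, extend as needed).
const TRACKING_PARAMS = new Set(['fbclid', 'gclid', 'msclkid']);

// Return the clean URL a canonical tag or internal link should point at.
function canonicalizeUrl(rawUrl) {
  const url = new URL(rawUrl);
  // Copy the keys first so deleting does not disturb the iteration.
  for (const key of [...url.searchParams.keys()]) {
    if (TRACKING_PARAMS.has(key) || key.startsWith('utm_')) {
      url.searchParams.delete(key);
    }
  }
  url.hash = ''; // fragments never reach the server
  return url.toString();
}
```

The output of a function like this is what belongs in the canonical tag and in internal links, so crawlers keep seeing one URL per page no matter how visitors arrive.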

Crawl traps and orphan pages

Two opposite failures show up on the same sites. A crawl trap is a place where the crawler can generate infinite URLs, usually a faceted filter on a "find an attorney" or blog index that produces ?sort=, ?page=, ?practice= combinations without limit. An orphan page is the reverse: a real page with no internal links pointing to it, so a crawler can only find it through the sitemap and treats it as unimportant.

Location pages are the most common orphans on law firm sites. A developer builds 15 city pages, links to four of them in the footer, and leaves 11 stranded. You find orphans by crawling the site with a tool like Screaming Frog or Sitebulb, exporting the URL list, and comparing it against the sitemap and the internal link report. Anything in the sitemap with zero inbound internal links is an orphan, and it needs a real link from a relevant parent page.
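The comparison described above is a set difference, and it is short enough to script once you have the two exports. This sketch assumes the sitemap URLs arrive as an array of strings and the internal link report as an array of { from, to } pairs; both shapes and the name findOrphans are hypothetical, so map your crawler's export columns onto them.

```javascript
// Return sitemap URLs that no other page links to.
function findOrphans(sitemapUrls, internalLinks) {
  const linked = new Set();
  for (const { from, to } of internalLinks) {
    if (from !== to) linked.add(to); // a page linking to itself does not count
  }
  return sitemapUrls.filter((url) => !linked.has(url));
}
```

The homepage will often appear in the result if nothing links back to it in your export, so treat the list as candidates to review, then add a real link from a relevant parent page for each true orphan.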

Rendering: what Google actually receives

A page can be crawlable and still arrive at the index nearly empty. That happens when the content depends on JavaScript that the crawler does not execute the way your browser does.

Plenty of law firm sites built on heavy page builders or single-page frameworks render practice area copy client side. You open the page and it looks complete. The raw HTML the crawler fetches first is a shell with a loading state, and the detailed content about truck accident liability arrives only after scripts run. Google can render JavaScript, but it does so on a delay and not always completely, so the firm ends up with a page that reads as thin or empty in the index while looking full to every human who visits.

The test takes two minutes. In Google Search Console, run URL Inspection on a practice area page, choose Test Live URL, and view the rendered HTML and screenshot. If your case-type copy, your internal links, and your contact form are present in that rendered output, rendering is fine. If they are missing, the content is not your problem. The delivery is.

The durable fix is to serve the content in the initial HTML response. Server-side rendering or static generation for the pages that earn cases (practice areas, location pages, attorney bios) removes the dependency entirely. If the platform cannot do that, the content for those templates needs to live in the HTML rather than being injected after load.
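A rough way to confirm the content really is in the initial response is to look at the raw HTML before any script runs. The sketch below is an approximation under stated assumptions: the function name missingFromInitialHtml is hypothetical, the tag stripping is regex-based rather than a real HTML parser, and it does not replace the Search Console live test, which shows what Google itself rendered.

```javascript
// Return the phrases that do NOT appear in the text of the raw HTML.
function missingFromInitialHtml(rawHtml, requiredPhrases) {
  // Remove scripts, styles, and tags so only readable text remains.
  const text = rawHtml
    .replace(/<script[\s\S]*?<\/script>/gi, ' ')
    .replace(/<style[\s\S]*?<\/style>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')
    .replace(/\s+/g, ' ')
    .toLowerCase();
  return requiredPhrases.filter((phrase) => !text.includes(phrase.toLowerCase()));
}
```

Pass it the unrendered page source and a few phrases that only appear in the body copy of that practice area page. Anything it returns is content that exists only after JavaScript runs.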

JurisPage Tip

From our experience auditing firm sites, the rendering problem hides behind good-looking pages, so it survives for years. The tell is a Search Console "Crawled, currently not indexed" status on detailed pages that obviously have plenty of content when you visit them. When the rendered HTML is thin but the live page is rich, you have found it.

Indexation: crawled is not the same as indexed

Indexation is where the page either enters Google's index or gets excluded. The Pages report in Search Console is the instrument. You compare the number of pages you expect to be indexed against the number that are, then you read the exclusion reasons one by one.

Crawled, currently not indexed

This status on a stack of similar pages usually means Google fetched them, decided they were near duplicates of each other or too thin to store, and declined. The textbook law firm version is a set of city pages built by find-replacing the city name into one template. Twenty pages that say the same thing with "Phoenix" swapped for "Tucson" do not get twenty rankings. They get one or none.

The fix is not more pages. It is fewer pages that are genuinely distinct. A location page earns its index slot when it carries something only that location has: the local courthouse and filing specifics, real case results from that venue, the actual office and staff, jurisdiction-specific procedure. If you cannot make a location page genuinely different, do not publish it. Thin location sprawl drags down the pages around it.
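You can put a number on "genuinely distinct" before Google does. One common technique, swapped in here for illustration and not a description of how Google scores duplicates, is Jaccard similarity over word shingles: split each page's body text into overlapping three-word sequences and measure how much the two sets overlap. The function names are hypothetical.

```javascript
// Build the set of overlapping word sequences ("shingles") for a text.
function shingles(text, size = 3) {
  const words = text.toLowerCase().match(/[a-z0-9']+/g) || [];
  const set = new Set();
  for (let i = 0; i + size <= words.length; i++) {
    set.add(words.slice(i, i + size).join(' '));
  }
  return set;
}

// Jaccard similarity: shared shingles divided by all distinct shingles.
// 1 means identical wording, 0 means nothing in common.
function jaccard(textA, textB, size = 3) {
  const a = shingles(textA, size);
  const b = shingles(textB, size);
  if (a.size === 0 && b.size === 0) return 1;
  let shared = 0;
  for (const s of a) if (b.has(s)) shared++;
  return shared / (a.size + b.size - shared);
}
```

Run it pairwise across your location pages. Pairs that score close to 1 are the find-and-replace pattern, and any cutoff you choose for "too similar" is a judgment call rather than a published threshold.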

Discovered, currently not indexed

This one means Google knows the URL exists and has not bothered to crawl it yet. At scale it is a crawl-budget and internal-linking signal. The page is not important enough in the site's link graph for Google to prioritize. The answer is a stronger internal link from a relevant, already-indexed page, and fewer low-value URLs competing for the same crawl attention.

noindex and canonical mistakes

Two tags quietly cause most of the self-inflicted indexation damage on law firm sites.

A noindex left on a template after launch keeps an entire section out of search for months. It happens constantly when a staging environment is set to noindex globally and the production launch does not strip it from every template. A canonical tag pointing the wrong way is subtler: paginated blog pages or filtered listings that all canonicalize back to page one tell Google to ignore everything past the first page, and a location page that canonicalizes to the homepage removes itself.

Here is the hygiene rule. Every indexable page should have a self-referencing canonical that points to its own clean URL. Pages you do not want indexed should carry a noindex and should not be in the sitemap. Those two states should never contradict each other, and a page should never both be in the sitemap and carry a noindex.
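The hygiene rule translates directly into a check you can run over a crawl export. This sketch assumes each page is a record with url, noindex, canonical, and inSitemap fields; that shape and the name findIndexingConflicts are hypothetical, so adapt them to whatever your crawler exports.

```javascript
// Flag pages whose noindex, canonical, and sitemap states contradict each other.
function findIndexingConflicts(pages) {
  const problems = [];
  for (const page of pages) {
    if (page.noindex && page.inSitemap) {
      problems.push({ url: page.url, issue: 'noindex page listed in sitemap' });
    }
    if (!page.noindex && page.canonical !== page.url) {
      problems.push({ url: page.url, issue: 'canonical points elsewhere: ' + page.canonical });
    }
  }
  return problems;
}
```

Not every non-self canonical is a mistake, since parameter URLs are supposed to point at their clean version, but on practice area, location, and bio pages every entry in the result deserves a look.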

JurisPage Tip

The fastest ranking recovery we ever deliver is almost never a content project. It is finding a leftover noindex or a bad canonical that has been suppressing a page that should be earning cases. Those pages often come back within days of the fix, because Google already knows them and only needed permission to show them again.

Index bloat and soft 404s

Index bloat is the opposite of thin indexation: too many low-value URLs in the index diluting the site's perceived quality. Tag archives, attachment pages, internal search results, and old campaign landing pages are the usual offenders on attorney sites. A soft 404 is a page that returns a 200 status while showing a "nothing here" experience, which confuses Google about what is real. Empty location pages for cities a firm no longer serves, or a practice area that was removed but still resolves to a stub, both qualify. Decide per URL: make it genuinely useful, return a real 404 or 410 if it should be gone, or 301 it to the closest relevant page.

XML sitemaps and status-code hygiene

Your XML sitemap is the map you hand to search engines. On a multi-practice, multi-location firm it should be structured, accurate, and honest.

Include every indexable URL: practice areas, locations, attorney bios, the money pages, and real blog content. Exclude everything you do not want indexed: thank-you pages, internal tools, tag archives, parameter URLs. Keep lastmod accurate and do not set every page to today's date, because a sitemap that claims the entire site changed this morning is a sitemap Google learns to distrust. For a large firm site, split the sitemap into logical children, one for practice areas, one for locations, one for attorneys, one for blog, and reference them from a sitemap index. Submit the index in Search Console and watch the submitted-versus-indexed gap per child, because that gap tells you exactly which template has an indexation problem.

Status codes are the other half of this. After a redesign or a CMS migration the URL structure usually changes, and the firm needs a complete old-to-new map with a 301 from every old URL to its closest new equivalent. The common failures are predictable. Redirect chains where an old URL 301s to a second URL that 301s again waste crawl budget and slow everything down, so collapse them to a single hop. Internal links that still point at the old URL and rely on the redirect should be updated to point at the final URL directly. Pages that should be gone should return 404 or 410, not a 200 stub, and not a 302 when the move is permanent.
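Collapsing chains to a single hop is mechanical once the redirect map exists as data. A minimal sketch, assuming the map is a plain object of old path to next path; the function name collapseRedirects is hypothetical.

```javascript
// Rewrite every redirect so it points straight at its final destination.
function collapseRedirects(redirects) {
  const collapsed = {};
  for (const start of Object.keys(redirects)) {
    const seen = new Set([start]);
    let target = redirects[start];
    // Follow the chain until the target is no longer itself redirected.
    while (Object.prototype.hasOwnProperty.call(redirects, target)) {
      if (seen.has(target)) throw new Error('redirect loop at ' + target);
      seen.add(target);
      target = redirects[target];
    }
    collapsed[start] = target;
  }
  return collapsed;
}
```

Deploy the collapsed map as your 301 rules, then use the same map to rewrite internal links so they point at final URLs directly.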

JurisPage Tip

The mistake we see most often on a relaunch is mapping the homepage and the top five pages, then sending everything else to the homepage with a blanket redirect. Every one of those becomes a soft 404 in Google's eyes, and the firm loses the rankings those deeper pages held. Map every old URL to a real, relevant destination, or let it 410. A blanket redirect to the homepage is a slow leak.

HTTPS and security signals

HTTPS has been a confirmed ranking signal for over a decade, and in 2026 there is no defensible reason for an attorney website to serve anything over plain HTTP. The ranking weight is the smaller issue. The bigger one is trust. A potential client is about to describe a legal problem, sometimes a criminal or family matter, into your contact form. A browser warning that the connection is not secure ends that visit, and it should.

The failures are consistent across firm sites. Mixed content is the most common: the site loads over HTTPS but an image, a script, or a stylesheet still loads over HTTP, which triggers a warning and can block the resource. Audit for it and update every internal reference to HTTPS or to a protocol-relative form. An expired or misconfigured certificate takes the whole site to a full-page security interstitial, so the certificate needs automatic renewal and monitoring, not a calendar reminder. And the site needs one canonical host: the HTTP and HTTPS variants, and the www and non-www variants, should all 301 to a single chosen version so authority does not split four ways.
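A mixed content audit can start with the page source. The sketch below scans HTML for subresources still referenced over plain HTTP. It is regex-based and deliberately narrow: the tag list and the name findMixedContent are assumptions, it does not look inside CSS or scripts, and ordinary links to other sites are ignored because navigating to an HTTP page is not mixed content.

```javascript
// Return http:// URLs loaded as subresources (images, scripts, frames,
// media, and <link> references such as stylesheets).
function findMixedContent(html) {
  const hits = [];
  const tagPattern = /<(img|script|iframe|source|video|audio|link)\b[^>]*>/gi;
  for (const [tag, name] of html.matchAll(tagPattern)) {
    const attr = name.toLowerCase() === 'link' ? 'href' : 'src';
    // Require whitespace before the attribute so data-src is not matched.
    const pattern = new RegExp('\\s' + attr + '\\s*=\\s*["\']?(http://[^"\'\\s>]+)', 'i');
    const match = tag.match(pattern);
    if (match) hits.push(match[1]);
  }
  return hits;
}
```

Run it against the source of each template, not each page, since mixed content usually comes from a theme file or a plugin and repeats everywhere that template is used.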

Core Web Vitals and the marketing stack that breaks them

Google measures three Core Web Vitals, and attorney websites tend to fail the same ones for the same reasons.

Metric	What it measures	Good	Needs work	Poor
LCP (Largest Contentful Paint)	How fast the main content paints	2.5s or less	2.5–4.0s	Over 4.0s
CLS (Cumulative Layout Shift)	How much the layout jumps during load	0.1 or less	0.1–0.25	Over 0.25
INP (Interaction to Next Paint)	How quickly the page responds to user interactions	200ms or less	200–500ms	Over 500ms

LCP usually dies on an unoptimized hero image. The full-width courthouse or skyline photo behind your headline is often 3 to 5 MB because nobody compressed it or served it in a modern format. Compress it, serve WebP or AVIF, and give it explicit width and height so it reserves its space. CLS comes from elements that load without reserved space: images with no dimensions, ad or chat containers that push content down when they appear, web fonts that reflow text. Reserve space for anything that loads late.

INP is the one that ambushes law firms, because firms love interactive tools. A settlement calculator, a multi-step intake form, a "do I have a case" quiz: if interactions with any of those routinely take longer than 200 milliseconds to produce a visible response, the page falls out of the good range, and past 500 milliseconds it is in the red. The fix is to break long JavaScript tasks into smaller chunks so the browser can answer the next click between them, and to make sure each step of a multi-step form transitions in under 200 milliseconds.
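Breaking a long task into chunks usually means doing a slice of work, yielding to the event loop, and continuing. A minimal sketch of that pattern follows. The names yieldToMain and processInChunks are hypothetical, and the 50 millisecond budget is an assumption chosen because browsers classify anything longer as a long task.

```javascript
// Hand control back to the event loop so a pending click can be handled.
function yieldToMain() {
  return new Promise((resolve) => setTimeout(resolve, 0));
}

// Run handleItem over every item, pausing whenever the time budget is spent.
async function processInChunks(items, handleItem, budgetMs = 50) {
  const results = [];
  let sliceStart = Date.now();
  for (const item of items) {
    results.push(handleItem(item));
    if (Date.now() - sliceStart >= budgetMs) {
      await yieldToMain();
      sliceStart = Date.now();
    }
  }
  return results;
}
```

In a settlement calculator, the items might be the rows of a damages table and handleItem the per-row math. The result is the same as a plain loop, but a visitor who clicks "Next" mid-calculation gets a response between slices instead of after the whole job.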

The weight is rarely your content. It is the marketing stack bolted on top. The recurring offenders on attorney sites:

A live chat widget that loads several hundred kilobytes of JavaScript before your headline paints
Call tracking injected as a blocking script in the document head
Two or three session-recording or heatmap tools running at once because nobody removed the old one when they added the new one
A page builder shipping render-blocking CSS and JavaScript from a dozen plugins

The pattern that fixes most of it is deferral. Nothing that is not needed for the first paint should load before it. A chat widget can wait until the visitor scrolls or moves the mouse, which still loads it for every real prospect while keeping it out of the critical path. Call tracking and analytics can load asynchronously. Heatmap tools can wait for idle time.

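// Defer the chat widget until the visitor first engages with the page.
// The provider URL is a placeholder; use your vendor's script URL.
let chatLoaded = false;
function loadChat() {
  if (chatLoaded) return; // several events can fire, load only once
  chatLoaded = true;
  const s = document.createElement('script');
  s.src = 'https://chat-provider.example/widget.js';
  s.async = true;
  document.body.appendChild(s);
}
// touchstart and keydown cover visitors who tap or tab before they scroll.
for (const type of ['scroll', 'mousemove', 'touchstart', 'keydown']) {
  window.addEventListener(type, loadChat, { once: true, passive: true });
}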
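For scripts that can wait for idle time, such as heatmap tools, the browser's requestIdleCallback is the usual hook. This is a small sketch with a fallback for environments that lack it; the name runWhenIdle and the five second timeout are assumptions for illustration.

```javascript
// Run a task when the browser is idle, or after timeoutMs at the latest.
function runWhenIdle(task, timeoutMs = 5000) {
  if (typeof requestIdleCallback === 'function') {
    requestIdleCallback(task, { timeout: timeoutMs });
  } else {
    // Fallback where requestIdleCallback is not available.
    setTimeout(task, 1);
  }
}
```

You would pass it a function that injects the heatmap script the same way the chat loader above injects the widget, so session recording starts only after the page is already usable.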

JurisPage Tip

At JurisPage we defer the chat widget until the first scroll or mouse move on every site we run. The widget still appears for every genuine prospect, just not before the headline and phone number. On a typical firm site that one change moves LCP out of the red without editing a single word of copy, and it usually improves the conversion rate too because the page becomes usable sooner.

The HTML size and DOM reality

Googlebot only processes a limited amount of raw HTML per page, and law firm "ultimate guide" pages and plugin-heavy practice area pages are exactly the kind that get bloated past the point where the crawler reliably reads the bottom. When inline scripts, inline styles, and thousands of unnecessary DOM nodes inflate the file, the content that sits low on the page (your contact form, your internal links, your case results, your FAQ) can fall past where the crawler is reading.

Keep scripts and styles in external files instead of inlining them. Keep the DOM lean: every unnecessary wrapper div, every hidden mobile menu duplicated in the markup, every unused component from the page builder adds nodes. Open Chrome DevTools, watch the document size in the Network tab, and run Lighthouse to see the DOM node count. If a practice area page is approaching one and a half megabytes of raw HTML or several thousand DOM nodes, the bottom of that page is at risk.
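If you want both numbers for many pages at once, they can be estimated from the page source. This sketch measures the HTML in bytes and counts opening tags as a rough stand-in for element nodes. The name htmlWeight is hypothetical, and the tag count is an approximation: Lighthouse counts nodes in the rendered DOM, which can be larger when scripts add elements after load.

```javascript
// Estimate raw HTML size and element count from page source.
function htmlWeight(html) {
  const bytes = new TextEncoder().encode(html).length;
  // Opening tags only: closing tags, comments, and the doctype are skipped.
  const elements = (html.match(/<[a-zA-Z][^>]*>/g) || []).length;
  return { bytes, elements };
}
```

For a single live page, document.getElementsByTagName('*').length in the DevTools console gives the rendered element count directly.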

Test the homepage and one practice area page in PageSpeed Insights and read the field data, the real-user numbers, not only the lab score. The lab score is a starting point. The field data is what Google acts on.

Mobile-first and page experience

Google indexes the mobile version of your site as the primary version. If a page looks right on a desktop and breaks on a phone, the broken phone version is the one that gets indexed and ranked. Legal searches skew heavily to mobile, because the person searching "dui attorney near me" at midnight is holding a phone, so this is not a rounding error.

The recurring mobile failures on attorney sites:

Tap targets too small or too close together, especially the click-to-call button and the navigation, which Google flags below roughly 48 by 48 pixels or under 8 pixels apart
Body text under 16 pixels, forcing pinch and zoom on the exact content a prospect is trying to read
Horizontal scrolling from fixed-width tables, oversized images with no max-width, or a desktop layout that does not collapse
Content parity failures, where the mobile template leaves out sections the desktop version shows, which means Google never indexes the missing content at all
Intrusive full-screen interstitials that cover the content on mobile, which can suppress the page

Content parity is the one firms miss most. If your mobile template collapses the practice area detail behind a tab that loads empty, or drops the FAQ entirely on small screens, that content does not exist as far as mobile-first indexing is concerned. Check the rendered mobile HTML, not just the desktop view.

Structured data: making the firm machine-readable

Structured data is how you state, in a format search engines and AI answer engines parse directly, that this is a law firm, these are its attorneys, this is the office, these are the services and locations. Done correctly it makes your firm legible to Google and to the AI systems that increasingly answer legal questions before a user ever clicks.

Use LegalService for the firm rather than a generic LocalBusiness. Put Person markup on each attorney bio page and connect it with sameAs to the official state bar profile and other authoritative listings, so the credential is verifiable rather than merely asserted. Schema.org's Attorney type is a subtype of LegalService, so it describes a practice rather than an individual, and its own schema.org description recommends LegalService instead; properties such as jobTitle and worksFor belong to Person. Use FAQPage only for questions that genuinely appear on the page, because marking up answers a visitor cannot see is the kind of thing search engines strip and distrust. Use BreadcrumbList that matches the visible breadcrumb. For a firm with more than one office, give each location page its own markup with that office's unique address, phone, and coordinates, and connect them to the parent firm entity rather than copying one office's details everywhere.

Here is an attorney bio example with the credential connections that matter:

{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Jane Adams",
  "jobTitle": "Partner",
  "url": "https://www.yourfirm.com/attorneys/jane-adams/",
  "worksFor": {
    "@type": "LegalService",
    "name": "Adams & Reed",
    "url": "https://www.yourfirm.com"
  },
  "knowsAbout": ["Personal Injury", "Wrongful Death", "Premises Liability"],
  "sameAs": [
    "https://www.statebar.example/profile/jane-adams-123456",
    "https://www.linkedin.com/in/jane-adams-attorney"
  ]
}

The error we find most is not missing schema. It is schema that disagrees with the page. The address in the markup does not match the address in the footer. The phone number was updated on the site and never in the JSON-LD. The firm name in the schema is the old name from before a merger. Mismatched structured data is worse than none, because you are handing a machine confident, wrong facts about your firm.

JurisPage Tip

Run every template through Google's Rich Results Test before launch, not after. We have seen one office with a transposed postal code in the markup quietly undercut visibility for that location for an entire quarter, because the page looked perfect to every human who opened it and only the machine saw the wrong number.

Validate it, then keep validating it. Schema breaks the same way robots.txt breaks: a developer ships an unrelated change, the markup falls out of sync with reality, and the regression hides until something surfaces it. Treat structured data as code that needs a test, not as a one-time setup.
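Treating structured data as code that needs a test can be as simple as comparing the markup to the visible text on every deploy. The sketch below is one way to do that under stated assumptions: the name schemaMismatches is hypothetical, it expects an already-parsed JSON-LD object with telephone and a nested address, it compares only the last ten phone digits (a US-style assumption), and the sample firm details in the usage note are invented.

```javascript
// Keep only the digits of a value, e.g. "(602) 555-0142" -> "6025550142".
function digitsOnly(value) {
  return String(value).replace(/\D/g, '');
}

// Return the schema fields whose values do not appear in the visible text.
function schemaMismatches(jsonLd, visibleText) {
  const problems = [];
  const pageDigits = digitsOnly(visibleText);
  const pageText = visibleText.toLowerCase().replace(/\s+/g, ' ');

  if (jsonLd.telephone) {
    // Last ten digits, so "+1" prefixes and punctuation do not matter.
    const phone = digitsOnly(jsonLd.telephone).slice(-10);
    if (!pageDigits.includes(phone)) problems.push('telephone');
  }
  const address = jsonLd.address || {};
  for (const field of ['streetAddress', 'postalCode']) {
    const value = address[field];
    if (value && !pageText.includes(String(value).toLowerCase())) {
      problems.push(field);
    }
  }
  return problems;
}
```

Given markup with postal code 85004 and a footer that reads "100 N Central Ave, Phoenix, AZ 85040", the function reports postalCode, which is exactly the transposed-digit case that looks perfect to every human reader. A check like this complements the Rich Results Test: that tool confirms the markup is valid, this confirms it is true.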

Multi-location and multi-office technical hygiene

A firm with more than one office multiplies every technical concern above, and adds a few of its own. This is purely the technical side of multi-location, the crawl, canonical, and markup mechanics, not the off-site work.

Each office gets its own location page with genuinely unique content and its own self-referencing canonical, never a canonical pointing back to a single flagship location, which would collapse them into one. Each location page carries its own structured data with that office's exact address, phone, and coordinates, matched to what the page displays and to what the firm publishes elsewhere, because a mismatch between the schema on the page and the firm's listed details is a contradiction a machine cannot resolve in your favor. If two offices serve overlapping areas, the pages must differ on more than the city name or Google will treat them as the duplicate-template pattern from the indexation section and index one or none.

A firm operating across multiple states also has a jurisdiction nuance: a divorce procedure page that is accurate in one state and wrong in another is not just a content issue, it is a duplicate-and-contradiction issue when the same template is reused across state pages with only the state name changed. Distinct, state-correct pages with clean canonicals avoid both the thin-duplicate trap and the accuracy problem at once.

Managing AI and search crawlers

The crawler list is longer than it used to be. Beyond Googlebot, AI systems run their own crawlers, and how your site treats them is now a technical decision a firm should make on purpose rather than by default.

Some of these crawlers power answer engines that cite sources and can send referrals. Others collect training data and return nothing. A firm can make a deliberate split in robots.txt: allow the crawlers that cite and can drive visibility, and decide case by case about the ones that only ingest. The point here is not which choice is right for your firm. The point is that the choice is made in a technical file, it has consequences for whether your firm appears in AI answers about legal questions, and most firms have never opened that file to look. Treat AI crawler access as a setting you control, the same way you control robots.txt for traditional crawlers, and revisit it as the answer engines change.
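One way to make the split deliberate is to write the decision down as data and generate the robots.txt groups from it. The sketch below does that. The function name buildCrawlerRules is hypothetical, and the policy in the usage note is only an example of the mechanism, not a recommendation: confirm each crawler's current user-agent token and purpose in that vendor's own documentation before relying on it.

```javascript
// Turn an explicit allow/block decision per crawler into robots.txt groups.
function buildCrawlerRules(policy) {
  return Object.entries(policy)
    .map(([token, decision]) => {
      const rule = decision === 'block' ? 'Disallow: /' : 'Allow: /';
      return 'User-agent: ' + token + '\n' + rule;
    })
    .join('\n\n');
}
```

Calling buildCrawlerRules({ 'OAI-SearchBot': 'allow', 'GPTBot': 'block' }) produces one group that allows the first token and one that disallows the second. Keeping the policy object in version control gives the firm a record of what was decided and when, which the raw file alone does not.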

Semantic and accessible HTML

Two structural details affect how cleanly a machine can extract your pages.

The first is semantic HTML. When a page wraps everything in generic divs, a crawler or an AI system has to guess which part is the answer to a legal question and which part is navigation, sidebar, or boilerplate. Strict use of article, section, main, nav, aside, header, and footer makes that distinction explicit, so the system can isolate the part of your premises liability page that actually answers the question from the attorney-bio sidebar next to it. View source on a key page and search for article and main. If the page is entirely divs, you are making extraction harder than it needs to be.

The second is accessibility, which overlaps with technical SEO more than most firms expect because both reward the same things: correct heading order, descriptive labels, content that works without a mouse. A correct heading hierarchy with one h1 and no skipped levels, descriptive alt text on attorney headshots and infographics, visible focus states on every interactive element, an aria-label on every click-to-call and form button, and adequate color contrast all serve screen-reader users and give crawlers cleaner structure at the same time. The work you do for accessibility is not separate from technical SEO. It is part of it.
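The heading rule (one h1, no skipped levels) is easy to check mechanically. This sketch takes the heading levels of a page in document order, however you extract them, and reports violations; the name headingProblems is hypothetical.

```javascript
// levels: heading levels in document order, e.g. [1, 2, 3, 2] for h1 h2 h3 h2.
function headingProblems(levels) {
  const problems = [];
  const h1Count = levels.filter((level) => level === 1).length;
  if (h1Count !== 1) problems.push('expected one h1, found ' + h1Count);
  for (let i = 1; i < levels.length; i++) {
    // Going deeper by more than one level skips a heading level.
    if (levels[i] > levels[i - 1] + 1) {
      problems.push('h' + levels[i - 1] + ' jumps to h' + levels[i] + ' at position ' + i);
    }
  }
  return problems;
}
```

In a browser console you can build the input with Array.from(document.querySelectorAll('h1,h2,h3,h4,h5,h6'), (h) => Number(h.tagName[1])). Moving back up the hierarchy, from an h3 to an h2 for example, is fine and is not flagged.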

A monitoring cadence that catches regressions

Technical SEO is not a one-time cleanup. It is a system that breaks every time someone deploys, swaps a plugin, or relaunches, so the real discipline is catching the regression before it costs a quarter of rankings.

A workable cadence for a firm site looks like this. Weekly, you scan Search Console for new crawl errors, indexation drops, and Core Web Vitals changes, because Search Console is the early-warning system and it is free. Monthly, you run a full crawl with a tool like Screaming Frog or Sitebulb and diff it against last month: new orphans, new redirect chains, new noindex tags, new 404s, schema that stopped validating. Quarterly, you do a deeper pass that includes server log analysis, where you watch what Googlebot actually fetches and how often, which is the only place you see crawl-budget waste with certainty rather than inference. And every single time the site changes in a meaningful way, a redesign, a migration, a platform change, a bulk content move, you run a focused technical audit immediately after launch, because that is when robots.txt, noindex, canonicals, and redirects break, all at once and all silently.
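The monthly diff is the step most worth scripting, because it is the one people skip. A minimal sketch, assuming each crawl has been reduced to an object keyed by URL with a status code and a noindex flag; that shape and the name diffCrawls are hypothetical, and a real diff would also cover canonicals, redirects, and schema validation.

```javascript
// Compare two crawls and list changes that usually indicate a regression.
function diffCrawls(previous, current) {
  const regressions = [];
  for (const [url, now] of Object.entries(current)) {
    const before = previous[url];
    if (!before) continue; // new URL, nothing to compare against
    if (before.status === 200 && now.status !== 200) {
      regressions.push(url + ': status ' + before.status + ' -> ' + now.status);
    }
    if (!before.noindex && now.noindex) {
      regressions.push(url + ': noindex added');
    }
  }
  for (const url of Object.keys(previous)) {
    if (!(url in current)) regressions.push(url + ': missing from crawl');
  }
  return regressions;
}
```

An empty result means nothing regressed on the dimensions checked. Anything else is a short, specific list to hand to whoever shipped the last change.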

JurisPage Tip

The pattern we see across firm after firm is that the damage does not happen during the audit. It happens three weeks later when a developer ships an unrelated change and nobody re-checks the technical baseline. Put a post-deploy technical check on the calendar as a standing item, not a reaction. The firms that do this stop having mystery traffic drops.

The technical check you can run this week

Checklist: confirm each item on your firm's site. Work top to bottom, because the first items decide whether you appear at all and the later ones decide how well.

robots.txt does not disallow the root, the practice area folder, or the blog, and it references the sitemap
Search Console "Test Live URL" shows your practice area copy, internal links, and contact form in the rendered HTML
No production page carries a leftover noindex, and every indexable page has a self-referencing canonical
Search Console Pages report shows no large gap between submitted and indexed, and you have read the top exclusion reasons
Every old URL from the last redesign 301s to a relevant destination, with no redirect chains and no blanket redirect to the homepage
HTTPS is enforced site-wide, there is no mixed content, and www and non-www resolve to one canonical host
LCP is under 2.5s and INP is under 200ms on the homepage and one practice area page, measured on field data
The chat widget, call tracking, and heatmap scripts are deferred, not blocking the first paint
The mobile rendered HTML contains everything the desktop version does, with no missing sections
LegalService and attorney bio schema validate, and the address and phone in the markup match the visible page
A full crawl shows no orphan pages among your money pages and no soft 404s

Where this fits

Technical SEO is not the part of the work that wins awards or signs the client in the conference room. It is the part that decides whether the rest of the work counts at all. A crawler that cannot reach your page, a renderer that receives an empty shell, an index that excludes your best content, a structured-data block that contradicts the page: each of those quietly cancels effort spent everywhere else.

Get the plumbing right and it stays right with maintenance, not heroics. Every page you publish, every review you earn, every citation you build after that has something solid underneath it instead of a slow leak. That is the whole job of the technical layer. It does not get you noticed. It makes sure that when you do everything else right, it actually lands.
