Content Pruning

“From our point of view, our quality algorithms do look at the website overall, so they do look at everything that’s indexed.”
– John Mueller

Lately, these are the first steps I take for SEO on a new client:

  1. Running and saving a Screaming Frog crawl that respects canonical, next/prev, and noindex directives. If the domain’s robots.txt file contains unfamiliar entries, I run an additional crawl that ignores it.
  2. Integrating the web property and versions into Google Search Console, optimizing settings, and linking the property with their Google Analytics account.
  3. Taking a look at analytics data and mapping the timeline against known and suspected Google algorithm updates.
  4. Adding the domain to software such as Ahrefs, then exporting reports on organic keywords and on pages that have referring domains.
  5. Combining analysis of the web crawl and the SEO software reports to begin my ultimate goal:
  6. Pruning the website of the cruft.

I say this because the overall trend I’ve noticed in Analytics over the last two years is stagnant, if not gradually declining, traffic. I also say it because I’ve noticed a trend, or rather an epidemic, of websites that never consider which of their pages are indexed and what that tells Google about the quality of the overall site.
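
As a rough illustration of steps 4 and 5 in the list above, here is a minimal Python sketch of combining a crawl export with the SEO software reports. The field names (`url`, `indexable`) and the sample URLs are hypothetical and are not the actual column headers of any Screaming Frog or Ahrefs export; the rule "indexable, with no organic keywords and no referring domains" is just one possible starting filter, not a definitive test for cruft.

```python
def find_prune_candidates(crawl_rows, keyword_urls, backlinked_urls):
    """Return indexable crawled URLs that have neither organic keywords
    nor referring domains. These are candidates for review, not a verdict."""
    candidates = []
    for row in crawl_rows:
        url = row["url"]
        if not row["indexable"]:
            continue  # already out of the index, nothing to prune
        if url in keyword_urls or url in backlinked_urls:
            continue  # the page earns rankings or links, so keep it
        candidates.append(url)
    return candidates


# Hypothetical sample data standing in for the exported reports.
crawl_rows = [
    {"url": "https://example.com/", "indexable": True},
    {"url": "https://example.com/guide", "indexable": True},
    {"url": "https://example.com/tag/misc", "indexable": True},
    {"url": "https://example.com/cart", "indexable": False},
]
keyword_urls = {"https://example.com/", "https://example.com/guide"}
backlinked_urls = {"https://example.com/"}

candidates = find_prune_candidates(crawl_rows, keyword_urls, backlinked_urls)
print(candidates)
```

Every URL this returns still deserves a look with human eyes before anything is removed.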


My first order of business in SEO is using data in tandem with evaluating a website from a user’s perspective. Does this page serve me well, but could be better? These pages can be earmarked for content additions. Does this page serve as a utility rather than an authoritative landing page? Better to noindex. WTF did I just land on? 404, but check the exported data first.
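
The three questions above amount to a small decision table, which could be sketched as follows. The `Action` names, the `triage` function, and its boolean inputs are all hypothetical; the first two answers still come from a person looking at the page, and the order of the checks is my own assumption.

```python
from enum import Enum


class Action(Enum):
    IMPROVE = "earmark for content addition"
    NOINDEX = "noindex"
    REVIEW = "check exported data before removing"
    REMOVE = "404"


def triage(serves_user, is_utility, has_traffic_or_links):
    """Map a manual page review to a pruning action.

    serves_user: the page is useful but could be better
    is_utility: the page is a utility, not an authoritative landing page
    has_traffic_or_links: the exported data shows keywords or referring domains
    """
    if is_utility:
        return Action.NOINDEX
    if serves_user:
        return Action.IMPROVE
    if has_traffic_or_links:
        return Action.REVIEW  # do not 404 a page that still earns something
    return Action.REMOVE


print(triage(serves_user=False, is_utility=False, has_traffic_or_links=False).value)
```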

A final example would be analyzing subsections of pages, like search results. I would ensure the dynamically generated URLs also have their canonical set to the parent page. Between Request Indexing and Remove URLs, Google Search Console is my best friend in content pruning.
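
For the search results case, here is a minimal sketch of deriving the parent page's canonical from a dynamically generated URL. It assumes the dynamic variants differ from the parent only by their query string, which will not hold on every site; the helper names `parent_canonical` and `canonical_tag` are hypothetical.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def parent_canonical(url):
    """Drop the query string and fragment, leaving the parent page URL."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_tag(url):
    """Build the <link rel="canonical"> element pointing at the parent page."""
    href = escape(parent_canonical(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


print(canonical_tag("https://example.com/search?q=shoes&page=2"))
```

If a site encodes its search parameters in the path rather than the query string, the mapping to the parent page would need its own rule.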

“The purpose of a content audit for SEO is to improve the perceived trust and quality of a domain, while optimizing crawl budget and the flow of PageRank (PR) and other ranking signals throughout the site.

By removing low-quality content from the index (pruning) and improving some of the content remaining in the index, the likelihood that someone arrives on your site through organic search and has a poor user experience (indicated to Google in a variety of ways) is lowered. Thus, the quality of the domain improves.”2

Just like a car always feels better about itself after a wash, so too does a website after pruning the cruft.