Ep201 - ‘How Google Search Crawls Pages’

Episode 201 contains the Digital Marketing News and Updates from the week of Feb 26 - Mar 1, 2024.

1. ‘How Google Search Crawls Pages’ - In a comprehensive video, Google engineer Gary Illyes sheds light on how Google's search engine discovers and fetches web pages through a process known as crawling.

Crawling is the first step in making a webpage searchable. Google uses automated programs, known as crawlers, to find new or updated pages. The cornerstone of this process is URL discovery, where Google identifies new pages by following links from known pages. This method highlights the importance of having a well-structured website with effective internal linking, ensuring that Google can discover and index new content efficiently.

A key tool in enhancing your website's discoverability is the use of sitemaps. These are XML files that list your site's URLs along with additional metadata. While not mandatory, sitemaps are highly recommended as they significantly aid Google and other search engines in finding your content. For business owners, this means working with your website provider or developer to ensure your site automatically generates sitemap files, saving you time and reducing the risk of errors.
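For readers curious what such a file looks like, here is a minimal sketch in Python that builds a sitemap from a list of pages. The function name build_sitemap and the example.com URLs are illustrative assumptions, not something from the video; in practice most website platforms generate this file for you.

```python
from xml.etree import ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, lastmod) pairs.

    lastmod is a 'YYYY-MM-DD' string, or None to omit the tag.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical pages used only for illustration.
    print(build_sitemap([
        ("https://example.com/", "2024-03-01"),
        ("https://example.com/about", None),
    ]))
```

The output would typically be saved as sitemap.xml at the root of the site and submitted through Google Search Console.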

Googlebot, Google's main crawler, uses algorithms to decide which sites to crawl, how often, and how many pages to fetch. This process is delicately balanced to avoid overloading your website, with the speed of crawling adjusted based on your site's response times, content quality, and server health. It's crucial for businesses to maintain a responsive and high-quality website to facilitate efficient crawling.

Moreover, Googlebot only crawls publicly accessible URLs, emphasizing the need for businesses to ensure their most important content is not hidden behind login pages. The crawling process concludes with downloading and rendering the pages, allowing Google to see and index dynamic content loaded via JavaScript.


2. Is Google Happy with 301+410 Responses? - In a recent discussion on Reddit, a user expressed concerns about their site's "crawl budget" being impacted by a combination of 301 redirects and 410 error responses. This situation involved redirecting non-secure, outdated URLs to their secure counterparts, only to serve a 410 error indicating the page is permanently removed. The user wondered if this approach was hindering Googlebot's efficiency and contributing to crawl budget issues.

Google's John Mueller provided clarity, stating that using a mix of 301 redirects (which guide users from HTTP to HTTPS versions of a site) followed by 410 errors is acceptable. Mueller emphasized that crawl budget concerns primarily affect very large sites, as detailed in Google's documentation. If a smaller site experiences crawl issues, it likely stems from Google's assessment of the site's value rather than technical problems. This suggests the need for content evaluation to enhance its appeal to Googlebot.
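To make the 301-then-410 sequence concrete, here is a small Python sketch that simulates what a crawler would see. It does not contact any server; the respond function, the REMOVED_PATHS set, and the example.com host are hypothetical stand-ins for the Reddit user's setup.

```python
from urllib.parse import urlsplit

# Hypothetical paths the site owner has permanently removed.
REMOVED_PATHS = {"/old-promo", "/discontinued-product"}


def respond(url):
    """Return (status_code, redirect_target_or_None) for a requested URL."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        # Step 1: send every non-secure request to its HTTPS counterpart.
        return 301, f"https://{parts.netloc}{parts.path}"
    if parts.path in REMOVED_PATHS:
        # Step 2: the secure URL reports that the page is gone for good.
        return 410, None
    return 200, None


def follow(url, max_hops=5):
    """Follow redirects like a crawler and return the status codes seen."""
    statuses = []
    for _ in range(max_hops):
        status, target = respond(url)
        statuses.append(status)
        if target is None:
            break
        url = target
    return statuses


if __name__ == "__main__":
    print(follow("http://example.com/old-promo"))  # [301, 410]
```

The crawler makes two requests for the removed page, which is the pattern Mueller described as acceptable.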

Mueller's insights reveal a critical aspect of SEO: the creation of valuable content. He criticizes common SEO strategies that replicate existing content, which fails to add value or originality. This approach, likened to producing more "Zeros" rather than unique "Ones," implies that merely duplicating what's already available does not improve a site's worth in Google's eyes.

For business owners, this discussion underlines the importance of focusing on original, high-quality content over technical SEO manipulations. While ensuring your site is technically sound is necessary, the real competitive edge lies in offering something unique and valuable to your audience. This not only aids in standing out in search results but also aligns with Google's preference for indexing content that provides new information or perspectives.

In summary, while understanding the technicalities of SEO, such as crawl budgets and redirects, is important, the emphasis should be on content quality. Businesses should strive to create original content that answers unmet needs or provides fresh insights. This approach not only helps with better indexing by Google but also engages your audience more effectively, driving organic traffic and contributing to your site's long-term success.


3. UTM Parameters & SEO - Google's John Mueller emphasized that disallowing URLs with UTM parameters does not significantly enhance a website's search performance. Instead, he advocates for maintaining clean and consistent internal URLs to ensure optimal site hygiene and efficiency in tracking.

Mueller's advice is straightforward: focus on improving the site's structure to minimize the need for Google to crawl irrelevant URLs. This involves refining internal linking strategies, employing rel-canonical tags judiciously, and ensuring consistency in URLs across feeds. The goal is to streamline site management and make it easier to track user interactions and traffic sources without compromising on SEO performance.

A notable point Mueller makes is regarding the handling of external links with UTM parameters. He advises against blocking these through robots.txt, suggesting that rel-canonical tags will effectively manage these over time, aligning external links with the site's canonical URL structure. This approach not only simplifies the cleanup of random parameter URLs but also reinforces the importance of direct management at the source. For instance, if a site generates random parameter URLs internally or through feed submissions, the priority should be to address these issues directly rather than relying on robots.txt to block them.
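As an illustration of the rel-canonical approach, the Python sketch below strips UTM parameters from a URL and builds the matching canonical tag. The helper names canonical_url and canonical_link_tag are hypothetical, and the sketch assumes that every parameter starting with "utm_" is tracking-only and that all other parameters matter to the page.

```python
import html
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url):
    """Drop utm_* query parameters and any fragment; keep everything else."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )


def canonical_link_tag(url):
    """Return the <link> element to place in the page's <head>."""
    href = html.escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


if __name__ == "__main__":
    tagged = "https://example.com/shoes?utm_source=news&color=red&utm_medium=email"
    print(canonical_url(tagged))       # https://example.com/shoes?color=red
    print(canonical_link_tag(tagged))
```

With a tag like this on the page, an external link carrying UTM parameters still resolves to the clean URL over time, which is why blocking those URLs in robots.txt is unnecessary.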

In summary, Mueller's guidance underscores the importance of website hygiene and the strategic use of SEO tools like rel-canonical tags to manage URL parameters effectively. His stance is clear: maintaining a clean website is crucial, but blocking external URLs with random parameters is not recommended. This advice aligns with Mueller's consistent approach to SEO best practices, emphasizing the need for site owners to focus on foundational site improvements and efficient management of URL parameters for better search visibility and tracking.


4. Transition Required for Google Business Profile Websites - Google has announced that starting in March 2024, websites created through Google Business Profiles (GBP) will be deactivated, with an automatic redirect to the businesses' Google Business Profile in place until June 10, 2024. This move requires immediate attention from GBP website owners to ensure continuity in their online operations.

For businesses unsure if their website is hosted through Google Business Profiles, a simple search on Google for their business name and accessing the edit function of their Google Business Profile will reveal if their website is a GBP creation. It’s indicated by a message stating, “You have a website created with Google.” For those without a GBP website, the option to link an external site will be available.

In response to this change, Google has recommended several alternative website builders for affected businesses. Among the suggested platforms are Wix, Squarespace, GoDaddy, Google Sites, Shopify (specifically for e-commerce), Durable, Weebly, Strikingly, and WordPress. Each offers unique features, with WordPress notable for its free website builder incorporating generative AI capabilities. However, users should be aware that content on WordPress may be used as training data for OpenAI and Midjourney unless they opt out.

Performance-wise, WordPress, Wix, and Squarespace have shown substantial improvements in Core Web Vitals, indicating enhanced site performance. However, businesses focused on search engine optimization (SEO) should note that Google Sites, despite being a recommended option, may not offer the best SEO capabilities.

Additionally, advertisers whose Google Ads campaigns link to GBP websites must update those website links by March 1, 2024, to avoid disruption. This deadline emphasizes the importance of promptly selecting a new website builder or hosting service to maintain online visibility and functionality.


5. New Carousel Rich Result for Enhanced Local Discovery - Google has introduced a carousel rich result feature, currently in beta, designed to enrich search experiences for users by displaying lists of local businesses, products, and events in a horizontally scrolling carousel format. This flexible feature allows for the combination of various items such as hotels, restaurants, and events into a single, visually appealing list, ideal for showcasing top activities in a city.

The carousel is applicable to a range of structured data types, specifically targeting LocalBusiness subtypes (e.g., Restaurant, Hotel, VacationRental), Products, and Events. This inclusivity extends to subcategories under LocalBusiness, like LodgingBusiness, following the Schema.org hierarchy.

To be eligible for this rich result, publishers must utilize ItemList structured data, ensuring that all information presented is visible on the webpage. The structured data must have ItemList as the top-level container, with each URL within the list pointing to distinct pages on the same domain. This requirement emphasizes the need for transparency and accuracy in the data provided.

A unique aspect of this update is the ability to mix and match different entity types within the carousel. For example, a webpage about "Things To Do In Paris" could feature structured data for events and local businesses, including landmarks and dining options, thus offering a comprehensive view of the city's offerings.

For businesses and events, this structured data format includes details such as images, ratings, prices, and review counts, aligning with Google's recommendations for specificity. This approach not only enhances the user's search experience but also provides local businesses and event organizers a new avenue to showcase their offerings more dynamically.

However, employing this structured data does not guarantee rich result display; it merely makes the content eligible. As this feature is still in beta, its testing phase implies ongoing evaluations and potential adjustments based on feedback and performance.
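The requirements above can be sketched in Python as a small builder that produces ItemList markup with mixed entity types. The function build_item_list and the business names and example.com URLs are hypothetical. Only a few Schema.org properties are shown, so check Google's documentation for the full set of required and recommended fields (images, ratings, prices, and so on).

```python
import json
from urllib.parse import urlsplit


def build_item_list(entries):
    """Build ItemList JSON-LD from (schema_type, name, url) tuples."""
    urls = [url for _, _, url in entries]
    if len(set(urls)) != len(urls):
        raise ValueError("each list item must point to a distinct page")
    if len({urlsplit(url).netloc for url in urls}) > 1:
        raise ValueError("all list items must be on the same domain")
    return {
        "@context": "https://schema.org",
        "@type": "ItemList",  # ItemList is the top-level container
        "itemListElement": [
            {
                "@type": "ListItem",
                "position": position,
                "item": {"@type": schema_type, "name": name, "url": url},
            }
            for position, (schema_type, name, url) in enumerate(entries, start=1)
        ],
    }


if __name__ == "__main__":
    # Hypothetical "Things To Do In Paris" page mixing entity types.
    things_to_do = build_item_list([
        ("Event", "Evening Jazz Concert", "https://example.com/paris/jazz-concert"),
        ("Restaurant", "Chez Exemple", "https://example.com/paris/chez-exemple"),
        ("Hotel", "Hotel Exemple", "https://example.com/paris/hotel-exemple"),
    ])
    print(json.dumps(things_to_do, indent=2))
```

The printed JSON would go inside a script tag of type application/ld+json on the list page, alongside the same information shown visibly to users.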

In conclusion, Google's new carousel rich result represents a promising opportunity for digital marketers, local businesses, and publishers to elevate their visibility on search engines. By adhering to Google's structured data guidelines, entities can enhance their chances of being featured in these rich, interactive search results, potentially driving more engagement and visits to their webpages.


6. Google's Excuse on Performance Max Campaign Transparency - Google has clarified its approach to sharing performance data for Performance Max (PMax) campaigns, sparking discussions among digital marketers. The company stated that it intentionally avoids providing channel-specific Key Performance Indicators (KPIs) to prevent misinformation. This decision has raised concerns about transparency, with some advertisers suspecting that Google's reticence is tied to its emphasis on automation.

A Google spokesperson, responding to inquiries in the Google Ads Help Center, explained that examining aggregate Return on Ad Spend (ROAS) or Cost Per Acquisition (CPA) for a single channel within PMax campaigns could lead to misleading conclusions, because a channel's average performance does not reflect the incremental cost of acquiring an additional conversion through that channel. The spokesperson emphasized, "One channel may seem better than another with stronger ROI on average. However, this doesn’t account for the marginal cost of the next conversion on that channel. The ‘best’ channel in one auction isn’t the best option in another auction." This explanation underscores Google's rationale for its real-time decision-making process in ad placements, aimed at securing the most cost-efficient, high-ROI conversions.
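A quick worked example may help show the difference between average and marginal cost. The figures below are invented purely for illustration and do not come from Google; the point is only that the channel with the lower average CPA is not necessarily the cheaper place to buy the next conversion.

```python
# Hypothetical figures: total spend, conversions so far, and the estimated
# cost of winning one more conversion on each channel.
CHANNELS = {
    "A": {"spend": 1000.0, "conversions": 100, "next_conversion_cost": 25.0},
    "B": {"spend": 1500.0, "conversions": 100, "next_conversion_cost": 12.0},
}


def average_cpa(channel):
    """Average cost per acquisition: total spend divided by conversions."""
    return channel["spend"] / channel["conversions"]


def best_on_average(channels):
    """Channel with the lowest average CPA."""
    return min(channels, key=lambda name: average_cpa(channels[name]))


def best_at_margin(channels):
    """Channel where the next conversion is cheapest."""
    return min(channels, key=lambda name: channels[name]["next_conversion_cost"])


if __name__ == "__main__":
    for name, channel in CHANNELS.items():
        print(name, "average CPA:", average_cpa(channel))
    print("Best on average:", best_on_average(CHANNELS))
    print("Best for the next conversion:", best_at_margin(CHANNELS))
```

In this made-up case channel A looks better on average (10 versus 15 per conversion), yet the next conversion is cheaper on channel B, which is the situation the spokesperson describes.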

This stance by Google underscores a broader debate within the digital marketing community regarding the balance between automated decision-making and the need for transparency. Advertisers advocate for greater insight into campaign performance metrics to ensure their marketing strategies align with their brand's interests, rather than solely serving Google's objectives. The tension between the effectiveness of AI-driven advertising solutions and the desire for more granular performance data reflects ongoing concerns about the alignment of Google's products with advertisers' goals.


7. Advertiser Control in Google's Search Partner Network - Google is providing advertisers with enhanced control and transparency regarding ad placements within the Search Partner Network. This move aims to address concerns raised by advertisers about the placement of their ads on non-Google websites, some of which were reported to contain inappropriate content.

The update introduces impression-level placement reporting for sites within the Search Partner Network, allowing advertisers to see exactly where their ads are being displayed. This level of detail was previously unavailable, marking a step forward in Google's efforts to provide transparency.

Furthermore, Google is implementing a change that allows exclusions of certain ad placements to be applied not only within its own platforms, such as YouTube and display ads, but also across the entire Search Partner Network. This means that if an advertiser chooses to exclude an ad placement at the account level, this exclusion will now extend across a wider range of sites, ensuring more comprehensive control over where ads appear.

This update comes in the wake of criticism from Adalytics, a crowd-sourced advertising performance optimization platform. Adalytics had accused Google of placing search ads on inappropriate websites through the Search Partner Network, including sites with pornographic, sanctioned, and pirated content. Google refuted these claims, highlighting inaccuracies in Adalytics' reports. Before these allegations, advertisers could not opt out of the Search Partner Network for PMax campaigns; opting out was possible only for other types of campaigns.

In response to the controversy, Google had temporarily allowed PMax users the option to opt out of the Search Partner Network until March 1. With the removal of this temporary measure, Google's latest update aims to empower advertisers with greater insights and control, ensuring their ability to safeguard their brand's reputation by avoiding placement near inappropriate content.

The Search Partner Network includes a variety of websites and apps that collaborate with Google to show search ads, extending the reach of these ads beyond traditional search platforms to include major properties like YouTube and Google Discover, as well as other sites not directly associated with search activities. Google maintains that including campaigns in the Search Partner Network can significantly improve clicks and conversions, offering advertisers the opportunity to engage customers on a broader array of sites.


8. Google Ads' New Policy on Pausing Low-Activity Ad Groups - Google Ads has introduced a significant update aimed at enhancing budget efficiency for advertisers. Starting March 11, the platform will automatically pause ad groups that have shown low activity. Specifically, the policy targets ad groups that were created at least 13 months ago and have not generated any impressions in the past 13 months. This initiative is part of Google's efforts to help advertisers allocate their budgets more effectively by eliminating underperforming ad groups from their campaigns.

The implementation of this policy is scheduled to be completed by April 30, following a rollout period of just over seven weeks. This will encompass all production Google Ads accounts, ensuring a comprehensive application of the new rule. Advertisers affected by this change will receive notifications, and they will have the option to reactivate their paused ad groups. However, Google recommends a thorough review of these ad groups before reactivation. Advertisers are advised to only unpause ad groups that are anticipated to generate impressions in the near future. It's important to note that if these reactivated ad groups fail to secure any impressions within three months, they will be automatically paused once again by Google.

Moreover, Google has confirmed that advertisers will retain the ability to update and modify their ad groups even while they are in a paused state. This flexibility allows advertisers to make necessary adjustments to their ad groups to potentially improve performance before deciding to unpause them.
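For advertisers who want to estimate which ad groups may be affected, the 13-month rule can be sketched in Python as follows. This is only an approximation based on the description above, not Google's actual implementation; the function names and the way months are counted are assumptions.

```python
import calendar
from datetime import date


def months_before(day, months):
    """Return the date `months` calendar months before `day`.

    The day of month is clamped so the result is always a valid date.
    """
    total = day.year * 12 + (day.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def should_pause(created, last_impression, today):
    """Apply the rule as described: old enough and no recent impressions.

    last_impression is None when the ad group has never had an impression.
    """
    cutoff = months_before(today, 13)
    old_enough = created <= cutoff
    no_recent_impressions = last_impression is None or last_impression < cutoff
    return old_enough and no_recent_impressions


if __name__ == "__main__":
    rollout_start = date(2024, 3, 11)
    # Old ad group with no impressions: would be paused.
    print(should_pause(date(2022, 1, 1), None, rollout_start))
    # Old ad group with an impression in June 2023: would stay active.
    print(should_pause(date(2022, 1, 1), date(2023, 6, 1), rollout_start))
```

Running a check like this against an export of ad group creation dates and last-impression dates gives a rough list of candidates to review before the rollout completes.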

This update is a strategic move by Google to ensure that advertisers' budgets are not wasted on ad groups that do not contribute to their campaign's success. By encouraging advertisers to focus on ad groups that are more likely to perform well, Google aims to improve the overall efficiency and effectiveness of advertising campaigns on its platform. Advertisers are encouraged to review their ad groups in light of this new policy to make informed decisions about which ad groups to keep active and which to revise or eliminate.