Tracking and AI
As AI becomes increasingly prevalent, tracking solutions are affected to a growing extent. The implications range from harmful, excessive AI crawling that pushes servers to the limits of their capacity, to new analysis options through the improved pattern recognition and inference capabilities of AI models applied to tracking data, to integrations and interfaces that open up many exciting possibilities.
AI crawlers distort tracking data
To put this in perspective: GoogleBot, GPTBot, ClaudeBot, AppleBot, PerplexityBot, and the like together account for 39% of all internet traffic. For every three human visitors, there are two bots. GoogleBot and BingBot still account for the majority of crawling requests, but with the growing number of AI crawlers, the risk of distorted data in tracking tools is rising as well. Given how dynamically the landscape is developing and how quickly new crawlers are released onto the internet, it is difficult for tool providers to keep up. Google Analytics filters bots using its own research data and a list from the IAB. Matomo by default excludes user agents that do not have JavaScript enabled; admins can also exclude individual user agents.
LUX also maintains a list of bots that are excluded from tracking, which users can extend or customize for their own projects as they see fit. This is a major advantage of open source tracking tools: the source code can be modified transparently, and a strong community behind the projects reports new bots. PostHog likewise maintains a list of excluded bots, to which newly reported bots are added. Of course, it cannot be ruled out that data is distorted by crawlers, but responding quickly to such erroneous data is what distinguishes good software. The image below shows the distortion caused by a weekly crawl from Sistrix, which massively inflates the number of page views. At the same time, it is just as important to record whether crawlers can reach the website at all, or whether GoogleBot and Co. encounter obstacles when exploring its content, which would hurt visibility on Google, ChatGPT, and Bing.
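To illustrate the principle, here is a minimal sketch of user-agent based bot filtering as a tracking backend might apply it before counting a page view. The signature list and the helper functions are hypothetical examples, not the actual lists shipped by LUX, Matomo, or PostHog.

```typescript
// Minimal sketch of user-agent based bot filtering before a page view is counted.
// BOT_SIGNATURES is an illustrative excerpt; real lists are much longer and
// updated continuously by the tool vendors and their communities.
const BOT_SIGNATURES: string[] = [
  "googlebot",
  "bingbot",
  "gptbot",
  "claudebot",
  "applebot",
  "perplexitybot",
  "sistrix", // example of an SEO crawler that would otherwise inflate page views
];

function isBotRequest(userAgent: string | undefined): boolean {
  if (!userAgent) {
    // Requests without a user agent are treated as bots to stay on the safe side.
    return true;
  }
  const ua = userAgent.toLowerCase();
  return BOT_SIGNATURES.some((signature) => ua.includes(signature));
}

// Hypothetical usage in a tracking endpoint:
function trackPageView(userAgent: string | undefined, path: string): void {
  if (isBotRequest(userAgent)) {
    return; // skip bots so they do not distort visitor statistics
  }
  // ...persist the page view for a human visitor...
  console.log(`Tracked page view for ${path}`);
}
```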
AI & SEO – „The Great Decoupling“
Google's AI Overviews have caused a stir in the SEO community. The decoupling of clicks from impressions in Google Search Console (GSC) is now referred to as “the great decoupling.” AI Overviews significantly reduce CTR because many search queries are answered directly on the results page. In addition, sources are barely visible, which further reduces organic traffic. In some cases, particularly high-quality content gains visibility because AI uses it for its summaries, which generates impressions. This “content scraping” by AI means that high-quality content from primary sources (such as universities and research institutions) is reused without attribution. It is already being disputed whether this practice is legally compliant or constitutes copyright infringement.
SEOs now face the new, additional task of optimizing content for these AI Overviews.
It is clear that websites' dependence on SEO is changing because user behavior is changing radically. Of course, there are also prophets who regularly predict the death of one topic or another: performance marketing is dead, SEO is dead, and so on. The fact is that Google is still the dominant traffic source for many websites. At the TYPO3 Developer Days 2025, I gave an extensive talk on these developments and possible countermeasures.
I would be happy to show your team how you can respond to these developments in a training session.
First-Party Data
Tracking consent is partially being superseded by “tracking prevention” built into browsers and devices, the lifespan of cookies is being limited, and tracking pixels are being deactivated. Browsers and the W3C are also actively combating fingerprinting. The constant back-and-forth regarding Google's third-party cookies is adding to the uncertainty about the future of web analytics.
The current trend is toward first-party cookies and cookieless tracking.
First-party cookies continue to be accepted by market participants as technically necessary and useful. However, it is important to ask whether setups with Stape.io (server-side tagging to circumvent browser tracking prevention) are really in line with the DPA.
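As a rough sketch of why server-set first-party cookies behave differently from cookies set via JavaScript in the browser, the following example sets a visitor cookie directly in the HTTP response. It assumes an Express backend; the cookie name visitor_id and the 400-day lifetime are illustrative choices, and whether such a setup is acceptable from a data protection perspective is exactly the question raised above.

```typescript
// Sketch: a server-set first-party cookie, assuming an Express backend.
// Cookies set in an HTTP response from the site's own domain are generally not
// capped to a few days by browser tracking prevention, unlike JS-set cookies.
import express from "express";
import { randomUUID } from "crypto";

const app = express();

app.use((req, res, next) => {
  // Only set the cookie if the visitor does not already have one.
  if (!req.headers.cookie?.includes("visitor_id=")) {
    res.cookie("visitor_id", randomUUID(), {
      httpOnly: true,
      secure: true,
      sameSite: "lax",
      maxAge: 400 * 24 * 60 * 60 * 1000, // ~400 days, roughly the maximum browsers honor
    });
  }
  next();
});
```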
Lively discussions about how to deal with the constantly rising prices of advertising networks do not change the need for accurate tracking of advertising spend. UTM parameters are truncated, cookies are automatically deleted, and fingerprints are falsified. Google's promise that “AI will take care of it” has not yet been fulfilled; otherwise, Google would not be clinging so stoically to third-party cookies in Chrome.
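One pragmatic mitigation is to capture campaign parameters yourself in first-party storage as soon as a visitor lands on the page. The sketch below assumes the standard UTM parameter names; the storage key campaign_attribution is made up for this example, and browsers may still clear local storage under their own policies.

```typescript
// Sketch: persist campaign parameters in first-party storage so that attribution
// survives truncated URLs and short-lived cookies.
const UTM_KEYS = ["utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content"];

function storeCampaignParameters(): void {
  const params = new URLSearchParams(window.location.search);
  const attribution: Record<string, string> = {};

  for (const key of UTM_KEYS) {
    const value = params.get(key);
    if (value) {
      attribution[key] = value;
    }
  }

  if (Object.keys(attribution).length > 0) {
    // localStorage is first-party and unaffected by third-party cookie blocking,
    // though browsers may still clear it under their own storage policies.
    localStorage.setItem(
      "campaign_attribution",
      JSON.stringify({ ...attribution, storedAt: new Date().toISOString() })
    );
  }
}
```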
It remains a “cat-and-mouse game.”
To deal with this, I recommend a tracking setup that is as flexible, independent, and agile as possible. Those who rely on only one solution can be painfully affected by unexpected disruptions.
Imagine:
You spend weeks and months working on an expensive advertising campaign, invest tens of thousands of euros, and then, in the middle of the campaign, the tracking fails or is disrupted by crawlers, AI agents, etc. If you rely exclusively on one solution, you can quickly end up throwing a lot of money down the drain.
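A sketch of what such a redundant setup could look like: each event is sent to several independent analytics endpoints, so a single failing or blocked tool does not cost you the entire campaign's data. The endpoint URLs below are placeholders, not real services.

```typescript
// Sketch: a redundant tracking dispatcher that fans out each event to several
// independent analytics endpoints.
interface TrackingEvent {
  name: string;
  properties: Record<string, string | number>;
}

const ANALYTICS_ENDPOINTS = [
  "https://analytics.example.com/collect",      // e.g. a self-hosted tool such as Matomo
  "https://backup-tracking.example.com/ingest", // a second, independent backend
];

async function dispatchEvent(event: TrackingEvent): Promise<void> {
  // Send to all endpoints in parallel; the failure of one endpoint does not
  // prevent the others from receiving the event.
  await Promise.allSettled(
    ANALYTICS_ENDPOINTS.map((url) =>
      fetch(url, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(event),
      })
    )
  );
}

// Hypothetical usage:
void dispatchEvent({ name: "campaign_click", properties: { utm_campaign: "spring_sale" } });
```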