Google search results have gone to shit. I am facing deindexing issues where Google flags pages as duplicate content and picks a canonical URL itself, despite there being no similar content.
Just the opening is similar, but the intent is totally different, and so is the focus keyword.
Not facing this issue in Bing and other search engines.
I've also noticed Google having indexing issues over the past ~year:
Some popular models on Hugging Face never appear in the results, but the sub-pages (discussion, files, quants, etc.) do.
Some Reddit pages show up only in their auto-translated form, and in a language Google has no reason to think I speak. (Maybe there's some deduplication to keep machine translations out of the results, but it's misfiring and discarding the original instead?)
Reddit auto translation is horrible. It’s an extremely frustrating feeling, starting to read something in your language believing it’s local, until you reach a weird phrase and realise it’s translated English.
It’s also clearly confusing users, as you get replies in a random language, obviously made by people who read an auto translation and thought they were continuing the conversation in their native language.
I will google for something in French when I don't find the results I want in English. Sometimes Google will return links to English threads (that I've already seen and decided were worthless!) auto-translated to French. As if that were any help at all.
The issue with auto-translated Reddit pages unfortunately also happens with Kagi. I am not sure if this is just because Kagi uses Google's search index or if Reddit publishes the translated title as metadata.
I think at least for Google there are some browser extensions that can remove these results.
The Reddit issue is also something that really annoys me, and I wish Kagi would find some way to counter it. Whenever I search for administrative things I do so in one of three languages, German, French or English, depending on the context in which the issue arises. And I would really prefer to only get answers that are relevant to that country. It's simply not useful for me to find answers about social security issues in the US when I'm searching for them in French.
Check that you're not routing unnamed SNI requests to your web content. If someone sets up a reverse proxy with a different domain name, Google will index both domains and freak out when it sees duplicate content. Also make sure you're setting canonical tags properly.
Edit: I'd also consider using full URLs in links rather than relative paths.
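For anyone who wants to sanity-check the canonical tag and relative-link advice above, here is a rough sketch in Python using only the standard library. It is not a definitive audit: `audit_page`, `LinkAudit`, and the `example.com` host are made-up names for illustration, and it only inspects HTML you hand it (fetching the page is left out).

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkAudit(HTMLParser):
    """Collects every rel=canonical href and every <a href> on a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel:
            self.canonicals.append(attrs.get("href"))
        elif tag == "a" and attrs.get("href"):
            self.hrefs.append(attrs["href"])


def audit_page(html, expected_host):
    """Return (problems, relative_links) for one page's HTML.

    expected_host is the hostname you want indexed (hypothetical here).
    """
    parser = LinkAudit()
    parser.feed(html)

    problems = []
    if len(parser.canonicals) != 1:
        problems.append(
            f"expected exactly one canonical tag, found {len(parser.canonicals)}"
        )
    for href in parser.canonicals:
        host = urlparse(href or "").netloc
        if host != expected_host:
            problems.append(
                f"canonical points at {host or 'a relative URL'}: {href}"
            )

    # Links with neither scheme nor host are relative; in-page anchors are skipped.
    relative_links = []
    for href in parser.hrefs:
        parts = urlparse(href)
        if not parts.scheme and not parts.netloc and not href.startswith("#"):
            relative_links.append(href)

    return problems, relative_links
```

Run it against the HTML of each page with the hostname you expect. An empty `problems` list means the page has exactly one canonical tag and it points at that host; `relative_links` lists the links that would resolve against whatever domain a third-party proxy serves the page from.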
Canonical tags are done perfectly. I've never changed them, and the blog is quite old too. I found a pattern where Google considers a page a duplicate because of the URL structure.
For example:
So, for a topic, if I have two of the above pages, Google will pick one of them as canonical despite the different keyword focus and intent. And the worst part is that it picks the worst page as canonical, i.e., the tag page over the blog page, or the blog page over the service page.
I did that. But I am not sure what to do with the blog and service pages. I can't tell Google to drop one of them. For now, I have prioritized the service page as canonical.
Because around 2018 they shifted their internal KPI towards keeping users on Google, rather than tuning for users finding what they are looking for, i.e. clicking off Google.
This is what has caused the degradation of search quality since then.