
The context matters. I'd happily read "Top 10" lists on a website if the site itself was dedicated to that one thing. "Top 10 Prog Rock albums", while a lazy, SEO-bait title, would at least be credible if it were on a music-oriented website.

But no, these stories all come from cookie-cutter "new media" blog sites, written by an anonymous content writer who's repackaged Wikipedia/Discogs info into Buzzfeed-style copywriting designed to get people to "share to Twitter/FB". No passion, no expertise. Just eyeballs at any cost.



This got me thinking that maybe one of the other big reasons for this is that the algorithms prioritize newer pages over older pages. This produces the problem where instead of covering a topic and refining it over time, the incentive is to repackage it over and over again.

It reminds me of an annoyance I have with the Kindle store. If I wanted to find a book on, let's say, Psychology, there is no option to find the all-time respected books of the past century. Amazon's algorithms constantly push to recommend the latest hot book of the year. But I don't want that. A year is not enough time for society to determine whether the material withstands time. I want something that has stood the test of time and is recommended by reputable institutions.


This is just a guess, but I believe that they use machine learning and rank results by clicks. I took some Coursera courses, and Andrew Ng sort of suggested that as their strategy.

The problem is that clickbait and low-effort articles can be good enough to get the click, yet shallow enough to drag society into the gutter. As time passes, the system is gamed more and more, optimizing for the least effort for the most clicks.
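
Roughly, a click-trained ranker degenerates into something like this (a toy sketch with invented page names and numbers, not Google's actual system):

    # Toy sketch: ranking purely by observed click-through rate (CTR).
    # Page names and counts are invented for illustration.
    pages = {
        "in-depth guide (high effort)":    {"clicks": 40,  "impressions": 1000},
        "clickbait listicle (low effort)": {"clicks": 120, "impressions": 1000},
    }

    def ctr(stats):
        """Click-through rate: clicks per impression."""
        return stats["clicks"] / stats["impressions"]

    # Rank by CTR alone: whatever gets the click wins, regardless of
    # how satisfying the page turns out to be after the click.
    ranking = sorted(pages, key=lambda p: ctr(pages[p]), reverse=True)
    print(ranking)  # clickbait first: it optimizes for the click, not the reader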


It sounds like the problem is that the search engine has no way to measure satisfaction after the click.


But they do. Or could. At least Google (and to a smaller extent Microsoft), if you are using Chrome/Bing, has exactly that signal. If you stay on the site and scroll (taking time, reading, not skimming), all of this could be a signal to evaluate whether the search result met your needs.


I've heard Google would guess with bounce rate. Put another way, if the user clicks on a link to website A, then after a few moments keeps trying other links/related searches, it would mean it was not valuable.


They tried to gain insight into this with the "bounce rate"
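
The last few comments describe signals that could be combined into a post-click satisfaction score. A hypothetical sketch (thresholds and weights are invented; the real signals aren't public):

    # Hypothetical post-click satisfaction score built from dwell time,
    # scroll depth, and "pogo-sticking" back to the results page.
    def satisfaction(dwell_seconds, scroll_depth, returned_to_results):
        """Score a click from 0.0 (bounce) to 1.0 (satisfied)."""
        if returned_to_results and dwell_seconds < 10:
            return 0.0                                 # classic bounce
        time_score = min(dwell_seconds / 120, 1.0)     # cap credit at ~2 min
        return 0.6 * time_score + 0.4 * scroll_depth  # scroll_depth in [0, 1]

    print(satisfaction(5, 0.1, True))     # 0.0  -> result didn't meet the need
    print(satisfaction(180, 0.9, False))  # 0.96 -> read slowly to the end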


> is that the algorithms prioritize newer pages over older pages.

They do? That would explain a lot - but ironically, I can't find a good source on this. Do you have one at hand?


It is pretty obvious if you search for any old topic that is also covered incessantly by the news. "royal family" is a good example. There's no way those news stories published an hour ago are listed first due to a high PageRank score (which necessarily depends on time to accumulate inbound links).
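
For reference, PageRank in its original form is just a fixed point over the link graph, which is why a page published an hour ago can't score well on it: its inbound links don't exist yet. A minimal sketch:

    # Minimal PageRank (power iteration). A brand-new page has no inbound
    # links, so its score sits at the (1 - damping)/n floor until other
    # pages link to it, which takes time.
    def pagerank(links, damping=0.85, iters=50):
        """links: {page: [pages it links to]}"""
        pages = list(links)
        n = len(pages)
        rank = {p: 1.0 / n for p in pages}
        for _ in range(iters):
            new = {p: (1 - damping) / n for p in pages}
            for p, outs in links.items():
                for q in outs:
                    new[q] += damping * rank[p] / len(outs)
            rank = new
        return rank

    graph = {
        "a": ["old"], "b": ["old"],  # established pages link to "old"
        "old": ["a"],
        "new": ["old"],              # published an hour ago: no inbound links
    }
    print(pagerank(graph))  # "new" is stuck at the floor despite linking out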


It depends on the content. The flip side is looking up a programming-related question and getting results from 2012.

I think they take different things into account based on the thing being searched.


Even your example would depend upon the context. There are many cases where a programming question in 2021 is identical to one from 2012, along with the answer. In those instances, would you rather have a shallow answer from 2021 or an in-depth answer from 2012? This is not meant to imply that older answers offer greater depth, yet a heavy bias towards recent material can produce that outcome in some circumstances.


If you're using tools/languages that change rapidly (like Kotlin, in my case), syntax from a few years ago will often be outdated.


Yes, yet there are programming questions that go beyond "how do I do X in language Y" or "how do I do X with library Y". The language- and library-specific questions are the ones where I would be less inclined to want additional depth anyhow, provided they aren't dependent upon some language- or library-specific implementation detail.


There are of course a variety of factors, including the popularity of the site the page is published on. The signals related to the site are often as important as the content on the page itself. Even different parts of the same site can lend varying weight to something published in that section.

Engagement, as measured in clicks and time spent on page, plays a big part.

But you're right, to a degree, as frequently updated pages can rank higher in many areas. A newly published page has been recently updated.

A lot depends on the (algorithmically perceived) topic too. Where news is concerned, you're completely right, algos are always going to favor newer content unless your search terms specify otherwise.

PageRank, in its original form, is long dead. Inbound link related signals are much more complex and contextual now, and other types of signals get more weight.
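
To picture the query-dependent freshness weighting described above, here is a made-up scoring function (weights and half-lives invented, not the real algorithm):

    # Made-up sketch of query-dependent freshness: for "newsy" queries an
    # exponential recency decay dominates; for evergreen queries it barely
    # matters. All weights and half-lives are invented for illustration.
    import math

    def score(relevance, age_days, query_is_newsy):
        half_life = 2 if query_is_newsy else 730       # days to halve freshness
        freshness = math.exp(-math.log(2) * age_days / half_life)
        weight = 0.7 if query_is_newsy else 0.1        # how much recency counts
        return (1 - weight) * relevance + weight * freshness

    # The same two pages flip order depending on the query type:
    print(score(0.9, 3000, True),  score(0.6, 1, True))   # news: new page wins
    print(score(0.9, 3000, False), score(0.6, 1, False))  # evergreen: old wins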


Your Google search results show the date on articles, do they not? If people are more likely to click on "Celebrity Net Worth (2021)" than "Celebrity Net Worth (2012)", then the algo will update to favour those results, because people are clicking on them.

The only definitive source on this would be the gatekeeper itself. But Google never says anything explicitly, because they don't want people gaming search rankings. Even though it happens anyway.
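
That feedback loop - rank drives clicks, clicks drive rank - is easy to simulate (all numbers invented):

    # Toy feedback loop: whichever result ranks first gets seen (and
    # clicked) more, which keeps it ranked first. Numbers are invented.
    clicks = {"Celebrity Net Worth (2021)": 105, "Celebrity Net Worth (2012)": 100}

    for _ in range(5):
        ranked = sorted(clicks, key=clicks.get, reverse=True)
        for position, page in enumerate(ranked):
            visibility = 1.0 / (position + 1)      # position bias
            clicks[page] += int(100 * visibility)  # top slot accrues clicks faster
        print(ranked[0], clicks)

    # A small initial preference for the newer date snowballs into a
    # permanent gap, even if the older page is the better answer.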


The new evergreen is refreshed sludge for bottom dollar. College kids stealing Reddit comments or moving around paragraphs from old articles. Or linking to linked blogs that link elsewhere.

It's all stamped with Google Ads, of course, and then Google ranks these pages high enough to rake in eyeballs and ad dollars.

Also, there's the fact that each year the average webpage picks up two more video elements / ad players, one or two more ad overlays, a cookie banner, and half a dozen banners/interstitials. It's 3-5% content spread thinly over an ad engine.

The Google web is about squeezing ads down your throat.


Really makes you wonder: you play whack-a-mole and tackle the symptoms with initiatives like this search engine. But the root of that problem and many, many others is the same: advertising. Why don't we try to tackle that?


Perhaps a subscription-based search engine would avoid these incentives.


Let’s go a few levels deeper and question our consumption culture


Exactly.

The only reason people make content they aren't passionate about is advertising.


> This got me thinking that maybe one of the other big reasons for this is that the algorithms prioritize newer pages over older pages.

Actually, that's not always the case. We publish a lot of blog content, and it's really hard to publish new content that replaces old articles. We still see articles from 2017 coming up as more popular than newer, better treatments of the same subject. If somebody knows the SEO magic to get around this, I'm all ears.


Amazon search clearly does not prioritize exact title matches.


It's the "healthy web" Mozilla^1 and Google keep telling their blog audiences about. :)

1 Accept quid pro quo to send all queries to Google by default

If what these companies were telling their readers was true, i.e., that advertising is "essential" for the web to survive, then how are the sites returned by this search engine for text-heavy websites (sites that are not discoverable through Google, the default search engine for Chrome, Firefox, etc.) able to remain online? Advertising is essential for the "tech" company middleman business to survive.


I'm not sure I agree with your example. It seems to me it is exactly the same as a "Top ten drinks to drink on a rainy day" list. There are simply too many good albums, and opinions differ, so a top ten would - just like the drinks - end up being a list of the most popular ones, with maybe one the author picks to stir some controversy or discussion. In my opinion the world would be a smarter place if Google ranked all such sites low. Then we might at least get fluff like "Top ten prog rock albums if you love X, hate Y and listen to Z when no one is around" instead.


Google won't rank them low because they actually do serve an important purpose. They're there for people who don't know specifically what they want; they're looking for an overview. A top 10 gives a digestible overview of some topic, which helps the searcher narrow down what they really want.

A "Top 10 albums of all time" post is actually better off going through 10 genres of popular music from the past 50 years and picking the top album (plus mentioning some other top albums in the genre) for each one.

That gives the user the overview they're probably looking for, whether those are the top 10 albums of all time or not. It's a case of what the user searched for vs what they actually really want.


"The best minds of my generation are thinking about how to make people click ads"



