The biggest problem with Stack Overflow is that both the questions and the answers age poorly. A question that was asked and answered for Android 7 is no longer relevant for Android 14. As a result, many, if not most, top answers are no longer relevant.
I find these days that the answers I'm looking for are increasingly answers that have snuck in during the brief period before questions are flagged as duplicates. For most original questions, the top ten posts are so old that they are no longer relevant.
StackOverflow would be so much better if questions that were flagged as duplicates were not closed for further answers. Flagging a question as a duplicate should be an advisory, not a death knell.
There's also a huge problem with questions that are not duplicates being flagged as duplicates, because there's no penalty for editors being wrong, and no easy way to appeal that decision. Determining whether a question really is a duplicate often involves close reading of both the original and the new question -- something that seems not to happen in the majority of cases. The problem is compounded because editors have a direct incentive to flag non-duplicates as duplicates. And so, the quality of answers continues to decline.
As a concrete example, my StackOverflow account has accumulated about 5,000 points for an answer I wrote for Android 7 that describes diagnostics that were changed in Android 8. Nevertheless, it continues to accumulate about 100 points a month even today. I can't be bothered to fix it, because it would involve significant research. The contents of the diagnostic results aren't documented. I had to go to Android source to figure out what was actually being reported. And it's not a topic that particularly concerns me at present. If someone were to go to the trouble of analyzing Android 14 sources to update my post, I doubt they would ever be able to overcome the very significant number of upvotes my answer has accumulated over the last 8 years.
> A question that was asked and answered for Android 7 is no longer relevant for Android 14.
SO power (top 0.07%) user here. Here's how I handle that:
- If adding a new answer to an old question, add a headline at the top of your new answer explaining that it's new for Android 8, post 2017, ES6, whatever else.
Here's a JS vs jQuery answer. 'Trending' shows the 2014 (new) plain-JS answer before the older (2011) jQuery answer:
- Generally this should be enough to handle fixing things, but if you still use SO and are aware you have an old answer, and don't want to update it, add a headline saying 'This is for Android 7, pre 2017, ES5 and older' etc.
So true. I feel like I used StackOverflow so much when I started programming, and now it’s rarely even the most useful result on Google for the question I’m trying to answer. I struggle to imagine a scenario now where I’d want to rely on fossilized SO instead of GPT cross-referenced with official documentation.
I think it's easy to forget that as we get better at programming, we tend to get better at either avoiding problems or finding our own answers, in which case SO is also less useful because we've learned to fish.
I had this realization a few years ago when comparing my own search strategies to other people's.
An MIT professor on OCW (intro to machine learning) said something that has stuck with me - in paraphrase: "In order to learn something we must give it a name."
Now I just intuit which search terms work and can scan the results to find leads. A lot of the results and examples take coaxing to work, but sometimes you find that diamond that perfectly addresses your problem.
Yep. A decade into my career, the problems I tend to face are often ones that aren’t covered on SO, and probably wouldn’t be “appropriate” for SO, either. Something something ‘opinion-based’.
I've found the version issue to be significant on SO and in documentation in general when accessed via a search engine. Sometimes, it's "does this solution still work at all?". Other times, it's "is there a better or preferred way to solve this in later versions?"
For SO, seems a "version applicability" option could go a long way to solving this.
SO has basically been obsoleted by LLMs at this point. Why even visit the site? They are clearly not interested in improving the UX for either questioners or responders, or the 'Closed as duplicate of a question that somebody answered poorly 8 years ago' policy would have been changed or scrapped long ago.
One obvious question, of course, is how the LLMs are going to keep themselves up to date. It's not as if Stack Overflow is doing that, given the points raised by the parent. It seems that we're headed for stasis unless a new developer Q&A paradigm emerges.
Um, yes? That's kind of my point. LLMs trained on SO answers from a time before SO sucked as badly as it does now. Where will next-gen LLMs get their training, without a good open Q&A site to leech from?
>At some point that “helpful” vibe changed. Stack Overflow ceased being a question and answer site, and began calcifying into a documentation site. Questions are answered with “see this other answer”.
>Maybe it’s the fate of all Q&A sites to eventually become documentation sites. If you answer the question “correctly” once, why bother repeating yourself?
You have misunderstood the editorial intent of Stack Overflow because it was always meant to be "documentation" more than a service to answer any questions including repeated/duplicated questions. This misunderstanding isn't your fault because most people visiting that site misunderstand it.
StackOverflow seemed to be "more helpful" in the beginning because the volume of content (aka "documentation") was still being seeded and built out by the new questions. Being helpful was a side effect of the site not having much content.
But now that there's a ton of content, a new question is more likely to be cross-referenced to a previous answer and therefore "Closed as duplicate" or some other reason. Yes, that seems a little rude and "unhelpful". And people don't realize that's the way Stack Overflow is designed to work.
EDIT to also add comment from co-founder Spolsky: "In our equation, we are a community of people writing answers that will be read by hundreds or thousands of people. Ours is a project more like wikipedia" from https://hackernews.hn/item?id=3656581
This is correct. It's also a problem. SO is still useful, but its utility has seriously diminished over time because of this approach. Too much of what's there is incorrect, outdated, or questions that aren't addressed elsewhere are unanswered because someone mistakenly thought it was already answered.
SO would be much more useful if there were some effective mechanism to get rid of or correct entries that were answered incorrectly in the first place or have become incorrect or no longer relevant with time.
>SO would be much more useful if there were some effective mechanism to get rid of or correct entries that were answered incorrectly in the first place or have become incorrect or no longer relevant with time.
Sometimes, the community will "fix" that by having a newer answer get upvoted beyond the "accepted" answer. E.g. a Python 3.2 answer with more modern syntax in 2014 gets upvoted above the Python 3.1 accepted answer in 2010:
But that type of "correction" only happens on popular questions. The more obscure questions and answers that are wrong will remain wrong because there are not enough eyeballs and/or widespread expertise to fix them.
I answered a popular-ish Ruby on Rails question in 2014 and it's still 3 votes shy from overtaking the outdated top answer from 2009. I don't even know if my answer is the best today because I haven't done Rails in years.
I definitely think this is a major pain point in the platform. I don't think that doing away with duplicates is the answer – instead there should be a way to downrank outdated stuff. Or something. I think they were experimenting with some approaches, but I haven't really been active in the last few years.
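As a rough illustration of what "downranking outdated stuff" could mean, here is a minimal sketch that discounts each answer's votes by its age using a half-life. Everything in it is hypothetical: the `Answer` class, the `decayed_score` and `rank` helpers, the four-year half-life, and the vote counts are made up for the example and do not describe how Stack Overflow actually sorts answers.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    label: str
    votes: int
    age_years: float


def decayed_score(answer, half_life_years=4.0):
    # Votes lose half their weight every `half_life_years` (an assumed parameter).
    return answer.votes * 0.5 ** (answer.age_years / half_life_years)


def rank(answers, half_life_years=4.0):
    # Highest decayed score first.
    return sorted(
        answers,
        key=lambda a: decayed_score(a, half_life_years),
        reverse=True,
    )


# Made-up numbers: an old top answer, a slightly newer runner-up, a recent answer.
answers = [
    Answer("2009 answer", votes=120, age_years=14.0),
    Answer("2014 answer", votes=117, age_years=9.0),
    Answer("2022 answer", votes=15, age_years=1.0),
]

ranked = [a.label for a in rank(answers)]
print(ranked)
```

With these invented numbers the old top answer drops below both newer ones. A real scheme would also have to avoid burying old answers that are still correct, which is the hard part.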
I've read that, but I disagree. Stack Overflow would be better for the community if it allowed repetition. Repetition isn't harmful, and it lets the next generation learn how to answer questions, which is a legitimately hard skill, and I'm arguing is more useful than creating a static repository of documentation.
>would be better for the community if it allowed repetition. Repetition isn't harmful,
The "community" also includes the people who volunteer to answer the questions. The answerers don't want repetition. Jeff Atwood wanted to prioritize the SO experience for the answerers.
To paraphrase JA's reasoning:
- asking questions is easy which can lead to an increasing volume of bad questions that degrades the site for the answerers
- answering questions is hard and more valuable for StackOverflow, so optimize the site for them. (JA wrote, "We feel that the world is awash in questions, but not answers. Answers are the real unit of work in any Q&A system. Therefore, the only logical thing to do is to maximize the happiness and enjoyment of answerers."[1])
I do understand how you would want SO to work differently and I'm not debating your opinion. I'm just relaying how the cofounders wanted StackOverflow to work.
Also note that all products, especially successful ones, change (anyone used Excel lately?) and the cofounders are all long gone. They built something that was temporally valuable, (rightfully) cashed in and moved on.
No, asking bad or thoughtless questions is easy. Asking a good question, complete with "this is what I've tried" and snippets and links to all relevant resources like API documentation, CodePens/Godbolts/standards and a summary of your thinking that has led to needing an answer in the first place is very hard.
A good question is probably at least an hour's work, and there are very few easy questions left that don't need substantial work even to frame.
An answer can be very in-depth with research and analysis, which also takes a long time, but it can also be anywhere on the spectrum down to fairly straightforward "you missed a function call" answer.
On the other hand, writing a good question sometimes solves your problem halfway through, when your Minimal Viable Example suddenly starts working and you realise the flaw, or the rubber-duck nature makes you think it through in a new way.
I think this assumption is not true anymore - ChatGPT can give an answer in less time and (human) effort than it takes to formulate and devise a question. So it really is about creating a set of curated question-answer pairs, where both the questions and the answers can be auto-generated. In this context, asking the same question different ways is really helpful to an AI. And there is no such thing as a bad question, there are just Q-A pairs that do not contribute significantly to training the model ("boring" questions). I've noticed Quora is experimenting with ChatGPT but they aren't at the point where you can flag the AI's answer as incorrect or misleading and fix it.
Well, the question-asker should be able to tell if ChatGPT's answer is incorrect. Most obviously, for software, they try it and it doesn't work, or for scientific questions, they look up the references and they don't exist. It is probably even possible to automate such checking to some extent. But similar issues apply to checking human answers - at some point, the answer is correct because it sounds plausible, not because it has been verified.
If people are posting answers that they don't know are correct, putting the burden on the asker to do that vetting, then the entire point of the site is moot.
News flash, but a lot of stackoverflow answers are garbage - incorrect, out of date, etc.. Does this make it useless? No, you just have to sift through.
Repetition is harmful. I've seen it play out on many expert forums (including an M:tG forum with several Level 2 Judges): sooner or later, depending on the person's stress-resistance, but always, they get too tired of answering the same questions over and over and over and over again. And manifestations of that phenomenon aren't pretty: they either leave the forum entirely, or turn quite acidic in their tone. In either case, their usefulness as answerers becomes quite low.
Repetition is harmful, stale answers are harmful, growing the results forever is harmful, searching for good old answers that are still valid is difficult, prioritizing newer answers means that the good old ones are harder to find, etc., etc., etc. Every forum, wiki, and growing documentation base eventually stagnates, overflows, or dies some other way. Wikipedia is one exception that mostly stays up-to-date.
What if people just don't answer the same questions over and over? Old answers can be linked, but the whole point of allowing repetition is to get fresh answers, ideally from new people. People going insane repeating themselves is a separate issue that can be solved in much better ways than prohibiting duplicate questions.
Serious question: how many answers have you posted on Stack Overflow?
Because people standing from the sidelines telling me what I should and shouldn't be doing with my spare time leaves more than a little bad taste in my mouth.
So basically you have no experience of what it's like to be answering questions? Most answers come from a fairly dedicated group of people who check the site most days, not from the occasional strays.
I'm not saying that people answering questions are always right or always have the final say, but it's easy to have strong opinions on what's "better for the community" if you're not the one doing the legwork or have to deal with the fallout from that.
I certainly would never be saying that sort of stuff unless I had put in the work. My balls just aren't big enough I guess.
> This misunderstanding isn't your fault because most people visiting that site misunderstand it.
And the reason few people understand it is because it's not logical. If you want documentation, read the documentation, and in most cases, that's what they did before asking the question. Anything more than cut and paste of the official documentation is discussion. "Well what is usually done to make this work in practice..." Nope. That's discussion, and to 99.9% of people asking for help, it's why such a site exists.
Do I have evidence? Sure, look at all the questions that have moved to Reddit, Discord, etc.
As someone who was very active answering questions on SO, I can say that duplicate or low-quality questions were a pretty big issue on the site. It might be fun to answer “what does * do in a Python function call” once; it’s quite another to answer that same question a dozen times. “Mark as duplicate” is an important way to keep the workload down, and to centralize questions around a single high-quality source of answers.
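For readers who have not met that particular question, a minimal sketch of the answer: in a call, `*` unpacks a sequence into positional arguments and `**` unpacks a mapping into keyword arguments. The `describe` function and its values are made up for illustration.

```python
def describe(name, language, year):
    return f"{name} uses {language} since {year}"


args = ("Ada", "Python", 2014)
kwargs = {"name": "Ada", "language": "Python", "year": 2014}

# *args spreads the tuple across the positional parameters,
# so this is the same call as describe("Ada", "Python", 2014).
positional = describe(*args)

# **kwargs spreads the dict across parameters by name.
keyword = describe(**kwargs)

print(positional)
print(keyword)
```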
People also forget that answers are not set in stone. Answers are editable - if an answer needs to be updated for the latest version of a framework, or it’s been obsoleted, just add that to the answer! Low-rep users can suggest edits, while higher-rep users can just go in and edit them. It's not uncommon to see the top answer have an edit pointing at another answer for some newer version.
Centralizing around a single deduplicated answer also has a major benefit: multiple answers can gather on a single question. It’s not always the case that one particular answer works for everyone, so try different answers and see what works (and upvote the ones that do work!)
> if an answer needs to be updated for the latest version of a framework, or it’s been obsoleted, just add that to the answer!
Depending on how much of a change it is, that might also be discouraged. Adding a disclaimer about being for an old version is generally fine, rewriting it for the newest version is a big no-no - that's when it should be a new answer.
> Answers are editable - if an answer needs to be updated for the latest version of a framework, or it’s been obsoleted, just add that to the answer!
Yeah, I tried that, and they've always been reverted without explanation. Best you can do is comment and hope someone reads it. I think that's more common today, since it doesn't take long to realize the comments are where the useful information is found.
Plus, if the change is significant, it's essentially a different solution entirely. There may be alternative solutions which would vote higher and comment relevance/quality would be impacted, etc. It starts to become unwieldy at a certain point.
Better that solutions can be versioned and possibly linked together for findability.
Tech changes (faster now than ever). This will allow newer answers to float to the top and make older questions more obviously replaceable, providing space for beginners to ask questions better couched in today's problem/solution space.
Personally, I find so many of the package-specific answers for Python are out of date. They don't work and/or are not written to current best practice.
One of the biggest problems with Stack Overflow, and this applies to Reddit as well, is that it's not a viable business model. There's nothing monetizable about the "core business" of asking questions and getting answers. If there was, Quora or Experts Exchange would have prevailed instead.
Finding the right nonprofit structure to handle this need is difficult. And you could always end up like Wikipedia instead, which seems to spend 90% of its donations on things other than its operations, and yet cries that it will shut down if you don't donate your next latte (but I don't drink lattes...).
A serious problem with Stack Overflow is that it allows only one right answer. As time passes, in cases where the question author hasn't specified an exact version of a library in the question title (which is the case more often than not), better solutions become possible. And the old solution can become impossible. Yet the initially accepted answer to the question remains chosen forever.
Sure, but there’s more than one answer listed. If the right answer isn’t clearly up to date, try a different one, or sort by recency if you want to see newer answers.
And, if the accepted answer is hopelessly outdated, suggest an edit to add that caveat so others know to skip it.
People rarely care to provide new answers to old questions yet you can be unable to post a new question because it's exactly the same as the one asked and answered a decade ago. You just need a solution relevant to the present day but no, you can't ask for it.
Thinking the accepted answer is the "right" answer is a common mistake made by people who aren't aware of how Stack Overflow works.
>Accepting an answer is not meant to be a definitive and final statement indicating that the question has now been answered perfectly. It simply means that the author received an answer that worked for them personally.
Okay but is this a failure of the user or an unintuitive UX? We can't just expect to pass cognitive burden onto users in a million places. UI affordances and conventions exist for a reason.
I imagine that's what causes the confusion. The big green check is a Q&A site model ("this worked for me") but the site is intended as a documentation site ("this is how you do it"). It's a bit strange to have one question closed as a duplicate of another one that has an answer that only worked for the other question asker.
Yet even highly upvoted posts (upvoted because they are useful and interesting) often get deleted (fortunately my points let me see the deleted ones) for the sake of some bullshit rules.
Stack Overflow was never about helping people out; right from the start.
It was a response to searching the internet trying to find a solution to your problem, ending up with a forum thread with three pages of discussions, and the OP coming back with "nvm i figured it out", but no solution. It was always about long-term documentation. Go read some of Jeff and Joel's old stuff from 2008-2009; it uses exactly those kind of examples.
And stuff like this:
> I think repetition of answers is healthy.
Yeah ... try answering questions for a few months and come back to me on that. I was never on Stack Overflow to be an unpaid tutor, to "teach the next generation", or to be a free debugger-as-a-service. Most people aren't.
And no doubt Stack Overflow is far from perfect, and that at times people interpret the rules (far) too strictly, but this is the same ol' bollocks that's been doing the rounds since 2008, mostly propagated by people who have rarely spent any time answering questions.
The problem I see is that SO is a substitute for well written documentation, rather than an incubator for quality technical writing.
It depends on open source docs being poor so noobs need experts to educate them. But it does not place a burden on the experts to improve the original docs.
For example, Python peps and standard library docs are often painfully and needlessly laborious, such that it's often easier to just read the code. Similarly, the git manual is written to induce episodes of acute madness.
The SO solution? Read a thousand piecemeal SO answers, get the job done in a slapdash way, get paid. Don't worry about the inaccessibility, waste, and exponentially accruing technical debt of the whole enterprise.
Priests of this cult often avoid reading the docs, and never pick up the responsibility to improve the awful docs.
The SO high priests will gladly regurgitate their knowledge like mother birds, often in writing as bad as upstream docs. Typically defending and normalizing the inaccessible upstream docs.
It's an accessibility nightmare.
You end up with millions of inefficient and insular tidbits whose sum is somehow less than the whole. It's designed to make users dependent, rather than free them.
It feels like there was a time where SO could have become more. But doing so would have lessened user reliance on SO, and required answerers to improve upstream docs. They chose to be self contained and derivative, rather than transformative to upstream projects.
It will be dismantled.
Now SO has so much entropy that it's only good for AI reprocessing into docs for upstream projects.
Thankfully much of SO knowledge is now bundled into LLMs. So getting actually legible docs is usually as easy as pasting the project docs and asking for organized and straightforward docs.
The challenge I see is closing the loop and ensuring upstream docs improve. The culture of inaccessibility is too entrenched for SO to do this - the org's revenues would have to collapse. But AI can work to the advantage of an accessibility focused upstart. They can liberate, mobilize and reintegrate the captive knowledge of the SO corpus. They can push it back into open source project docs.
And in cases where upstream prefers bad docs? The upstart should fork the project and take responsibility for project accessibility.
This seems like a rose colored look at SO. I'm sure the candor was a little bit more pleasant than usenet, but I'm not sure it was ever a free flowing Q&A. From the beginning the founders were pretty clear that a "read the faq" culture was worth encouraging on the site, not coincidentally because that aligns well with SEO.
Up until 2010 or so I used to be pretty active on SO. It was Jon Skeet, Marc Gravell, then me on the top users. I stopped for a variety of reasons, some of which are completely unrelated to SO, but here are several big issues I could see then (and commented about on Meta SO at the time):
1. Moderators were already getting out of control. They had decided among themselves that questions without a provable answer needed to be purged from the site despite them having clear value. A question like "Should I use Java or C++ for X?" can have an answer like "These are the benefits of each and things you should consider". By 2010, that was an automatic "closed, not constructive".
2. Users didn't understand what was and wasn't a duplicate. Two questions may sound similar but one important detail can completely change the answer.
3. SO had its most value when all the information was current but as time goes on answers become no longer current and I didn't know how that would be handled. A correct answer in Java 7 might be incorrect in Java 14. I believe this is still handled haphazardly and is a huge problem with, say, Android.
4. The system rewards low-hanging fruit and pretty much discourages any complex question or answer. A complex but legitimate question might be closed as being too specific. It's less likely to find an answer and fewer people understand the answer so won't upvote it (or, worse, will upvote the wrong answer that sounds right).
5. It suffers the problem that all sites do that require users to effectively rank answers (be that with upvote/downvote, liking or whatever) and that is that people vote for what they like, not what is correct. Post an objective question about an issue in C++ and a provably correct answer that is perceived to be negative of C++ will attract downvotes from C++ devotees. This applies to SO, reddit, social media sites, HN, etc. This is a pretty negative experience for the answerer.
Forums suck because they're time-ordered. SO's big value was that answers were net vote ordered so the top answer was often (but not always) the best and/or correct.
I wouldn't necessarily call SO "documentation". "Ossified opinion" would be more accurate in a lot of cases.
But StackOverflow suffered the issue every site that relies on user contributions/moderation does: it ultimately caters to those with the most free time, not those with the most expertise.
And I don't think there's an easy way around the problem.
The quote I come back to repeatedly is "the bureaucracy is expanding to meet the needs of the expanding bureaucracy" as it perfectly describes the Moderator Problem.
There are people who ask questions. You need questions but questions come with an audience. You can argue a well-crafted question is better than a bad one but it's not really an issue. The site only has value if the answers are there and they're good. This is the real value of answerers.
In my mind, moderators are the janitors. Yes, that job is important. You need to clean up tags, fix misspellings or bad grammar, remove low value questions and answers but the scope really is pretty limited. But there is a certain personality type that gets attracted to moderation that, if left unchecked, is completely toxic. Instead of cleaning up answers they decide to set policy. "Subjective" questions were one of the first big targets.
It requires constant vigilance for the moderators not to get completely out of hand. Wikipedia is a really great example of this.
Toxic moderators love solving imaginary problems. The argument for closing "subjective" questions was they didn't want it to descend into a flamewar. Thing is, that never happened. And if it did, you just delete the relevant comments and answers. There is absolutely no need for a site-wide ban on "subjective" questions. It's a complete non-problem.
Stack overflow exists because no one is teaching people how to use documentation properly. Many, many questions people have, have great answers, including context and further reading, if they actually found the documentation.
First-party documentation quality varies wildly, and in general has fallen off a cliff from the days when everything had a comprehensive, multi-volume, doorstopper manual.
There's also a lot of tinkering and breaking things that is necessary to develop a deeper understanding of a piece of technology, without which the manual is far less useful. Knowing where to look in the official documentation is half or more of the battle. Modern attention spans, planning expectations, systems complexity, etc. are not geared towards developing this deep understanding.
Moreover, the number one tool developers who cut their teeth over the past ~20 years have relied on to help them find the right answers, Google Search, has degraded massively in quality lately. Stack Overflow arguably breathed new life into search engines for a while, until SEO-hacked junk overtook it. Who can forget the classic forum posts "I fixed the problem [no details]" or "Why don't you Google it [even though this is the only meaningful Google result]". Even Stack Overflow runs into those issues, sometimes, though.
For many large software projects there is great, searchable and comprehensive documentation. If you know where to look, the answer to your question is most likely already there.
Of course there are questions not answered in the documentation, which is where stack overflow can shine.
I think the real problem is Google. If it were a great search engine, it would give you the appropriate documentation page for your problem, and many, many people would never visit Stack Overflow for the answer.
Even when the quality of official documentation is high, there are other uses for an SO-like site, I think:
- Handling corner cases that the original software developers never thought of, or their technical writers never thought to document, at least.
- Contextualizing, summarizing, and rephrasing the official docs in ways that people at different levels of familiarity and from different backgrounds find easier to understand.
- Providing errata and implementation notes that have been discovered over time by the users, before the official source is able to add them, or especially if the official source refuses to add them.
“Goddamnit, I forgot the syntax for doing a for…in and assigning both key and value in this language, yet again, and my first two guesses were wrong”
Searches, clicks docs page, greeted with 20,000 words about flow control in the language, only the eighth example of for loops has what I want
Searches, lands on SO, what I need is one screenful below the fold, done
The only strong counterexample to this pattern I can think of is everyone’s least favorite language, PHP, because of its combo of narrow-focused docs pages and community posts on each filling in any gaps and pointing out any weirdness or features that exist but should be avoided.
Very curious. To me the opposite happens, if I forget syntax all I want to look at is a simple example, which is often easily found in the official documentation
tried in one famous doc site: Python
search "for key value"
We get a more generally worded result only at position #12 (vs. #1-2 in a search engine)
And then see the excessive verbiage (though admittedly not the exaggerated 20,000)
> The for statement in Python differs a bit from what you may be used to in C or Pascal. Rather than always iterating over an arithmetic progression of numbers (like in Pascal), or giving the user the ability to define both the iteration step and halting condition (as C), Python’s for statement iterates over the items of any sequence (a list or a string), in the order that they appear in the sequence. For example (no pun intended):
Indeed, the difference to Pascal is what people are looking for as well as pun clarifications obscuring a single short sequence of substance
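For comparison, the "single short sequence of substance" that the search above was after fits in a few lines. This is a minimal sketch with a made-up `inventory` dict; `dict.items()` yields key/value pairs, which the `for` statement unpacks into two names.

```python
inventory = {"apples": 3, "pears": 5}

lines = []
for key, value in inventory.items():
    # Each iteration binds one key and its value.
    lines.append(f"{key}={value}")

print(lines)
```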