It's interesting that the HN community keeps invoking the prospect of a decentralized internet when Git was built to be decentralized in the first place. In spite of this, we have all congregated around GitHub for the community, and are shocked when the centralized service we've been using gets acquired by a company we don't trust. That's sort of the inherent problem with centralization: you have to trust a single party, and you can't. Maybe this event will finally shift things back in a decentralized direction.
Centralization has an undeniable fundamental attraction. Everything, really, seems to naturally trend towards centralization, and then major scandals reverse the trend. We will probably swing back and forth between the two in perpetuity.
The whole darn internet was built to be decentralized. And yet we all use GMail, the same handful of DNS servers, the same short list of major trunk hubs, the same shrinking list of bitcoin mining pools, etc.
I've always found it too difficult to choose the "decentralized" option for many services. It's just not convenient enough to spin up my own GitLab server, or my own email. I tried to do both of those things a while ago when I was still new to Linux, and gave up due to the number of steps involved, which I inevitably fucked up.
The solution for me would be packages which hold my hand through the install process, making such software as easy as possible to install. Obviously, packaging software this way would take far more work, and people qualified to do it would rather package more software than hold some noob's hand.
That being said, hopefully as software matures further, repackaging will become less necessary, and these kinds of packages will become more common, allowing decentralized systems to be easy enough to set up that they catch on.
I saw someone post on Twitter saying they were tired of being their own sysadmin.
It's true - I could set up my own mail server, my own Git hosting etc etc, but why would I? I'd rather just pay Fastmail for email and GitHub for some private repos.
I'm my own sysadmin (for personal email, small HTTP server, an OpenVPN and a few other minor services for personal use) and while it did take me a non-negligible amount of time to get things running smoothly it now only requires a few minutes of attention every other week to apply security updates and make sure my backups are running smoothly. In exchange for that I have total control over my data and basically unlimited customization potential. I'm definitely not planning on stopping that.
I do use Github regularly however, but it's because of the "social network" aspect. If you want to interact with many open source projects it has to be on github nowadays. The network effect is strong.
Congratulations, you can configure your own mail server. You're already one in a million. Out of that group, maybe one in a hundred can do it nearly as securely as one of the big services. This is the reason why people use the big centralised services, and also the reason smaller ones don't survive: email is ridiculously over-complicated. You won't convince anyone it isn't by explaining that you managed to DIY it in under a week.
Fortunately, rolling your own git isn't nearly as hard, and the percentage of people who could run a node compared to total users is larger. Hell, every person using git also has a fully-functional, secure and standard self-hosting stack already installed. Getting a GitHub-like web interface is similarly easy. This fundamentally reduces the risk of centralising, which in turn means that people don't have any qualms about using GitHub, since they can leave at any time. (Less true of issues, PRs, community, but hey.)
In either case, centralising has large benefits: one for handling complexity, one for giving you a community.
When you want decentralisation, it's because you don't want a single entity controlling what the ecosystem becomes. The circumstances where this is true are typically big utility-like things, where we care more about guaranteeing that water and electricity exist than the top end of providers' profits. Email is like this. You would never build your own power station, just like these days it would be bordering on negligent to use anything except Office365/GSuite/etc as your corporate email host. Lots of jurisdictions will create highly regulated marketplaces so you don't get Enron-California-style anti-competition. When things get that bad in Gmail-land, let us know.
[Edit: I was pretty snarky just there, but not at you! It was in general, I swear.]
I want to make that switch, but am rather intimidated by my inexperience in that domain. Are there any guides or tutorials you've used that you'd recommend?
Not a single tutorial comes to mind, but you can generally find a wealth of resources and tutorials online about setting up anything you might need.
Always start by setting up a strict firewall and then cautiously whitelist anything you need as you set up the rest. Don't bother with SELinux. Be very careful with your Postfix (or if you're masochistic, sendmail) config as it's rather (needlessly IMHO) intricate and it's easy to end up with a configuration that appears to work but is extremely broken. In particular, if you configure an open relay by mistake, spammers are going to have a lot of fun with your server and you'll end up blacklisted everywhere.
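As a sketch of the "strict firewall first, then whitelist" approach, a minimal nftables ruleset might look like the following. The ports listed are illustrative assumptions (SSH, SMTP, HTTP(S), IMAPS), not a recommendation; open only what you actually run:

```
# /etc/nftables.conf (illustrative fragment)
table inet filter {
  chain input {
    type filter hook input priority 0; policy drop;   # drop everything by default
    ct state established,related accept               # keep existing connections alive
    iif "lo" accept                                   # allow loopback
    tcp dport { 22, 25, 80, 443, 993 } accept         # whitelist only the services you run
    icmp type echo-request accept                     # allow ping
  }
}
```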
My "stack" is nginx for web, postfix + dovecot + spamassassin for email (+ roundcube when I wanted a webmail). Postfix was by far the trickiest to configure, although if you find a good tutorial online and can spare a couple of hours to understand how it all works it's not too bad.
I have a similar setup but use Exim instead of Postfix. And I am still glad I host my own services. Yes, it took time, but it will work the way I want it forever. Only lazy people prefer ready-made cloud services. Yes, it's convenient, but it's more costly, you lose control over your data, and they are constantly redesigning and reinventing things. You never know when they will sell the business, increase prices, or ban or limit you.
Have you tried mailcow? I find it strikes a pretty good balance between running your own mailserver while not getting overwhelmed with details.
It runs in containers and provides a webserver, webmail, caldav, spam filter, updater... I have made an ansible role that will back up to a Borg repository every hour, as well as restore the latest backup upon installation, if you need inspiration: https://galaxy.ansible.com/coaxial/mailcow/
If you use your own domain name, you can set mailcow to sync with your Gmail account so you have all your emails in mailcow and then pull the plug on Gmail without losing anything. I find it runs pretty well for me on a server with only 2gb of ram.
If you are on Windows, the equivalent is smartermail. For a fee it also supports ActiveSync, which simplifies client configuration and gives pseudo-push.
For the same reason that you don't opt for a dictator even though it is more convenient than democracy (until it isn't, at which point there is little you can do about it).
Alternatively, you could pay people to develop easier solutions for self-hosting. Or you could buy a service from a smaller hosting provider. There is more than the two extremes of "Google is email" and "I have to hand-carve my mail server".
So you are suggesting it is a moral obligation to choose a smaller provider, even if the largest provider is better?
Centralization happens because someone ends up doing it better than everyone else, and so everyone chooses to use that provider and they become the dominant force.
I get the need for diversity, but as an individual, I am going to choose the best provider, even if they are the biggest one.
> So you are suggesting it is a moral obligation to choose a smaller provider, even if the largest provider is better?
Not moral, but I've noticed that things all seem to be tied to a pendulum that swings back and forth. Pushing in the opposite direction is required to attempt a balance.
Why is the larger one better? I use email mainly over IMAP and can't see the advantage of using Gmail. Yes, you should prefer a smaller email provider. We live in the cyberpunk world now; corporations are the ones with power.
> So you are suggesting it is a moral obligation to choose a smaller provider, even if the largest provider is better?
Maybe, but the world really is too complex to break it down to such a simple question, it obviously is a tradeoff with many more factors to consider.
> Centralization happens because someone ends up doing it better than everyone else, and so everyone chooses to use that provider and they become the dominant force.
Well, but does it? I mean, no doubt such cases do exist, sure, but if you really look into how companies do become dominant, that is only one of many factors, and sometimes not even a necessary one.
> I get the need for diversity, but as an individual, I am going to choose the best provider, even if they are the biggest one.
Well, but how do you evaluate what "the best provider" is?
You might be comparing functionality, say. Or price. Or speed. Or any other property of the product as you could now choose to use it. And obviously all of those are important things to consider.
But my suggestion isn't that you should follow some abstract moral teaching because some ideology says that this is the right way, and the only right way. My point is that it may even be in our very own interest to choose a solution that is inferior in terms of current functionality/price/speed/whatever because there are long-term costs attached to the superior solution that actually make it more expensive, all things considered, than using the inferior solution now. So, arguably, the currently technically inferior option with a lower total cost would actually be the better solution.
To maybe make it more practical, but without any claim to being realistic, the numbers are obviously just made up: Let's assume that using Gmail saves you 10 minutes every day vs. using Thunderbird. Now, Gmail is privately owned, so if everyone chose to use Gmail, they would effectively have the monopoly over email. At that point, they have every incentive to add proprietary functionality for the sole purpose of making interoperability difficult. Which could prevent a new competitor from entering the market that would invent a new email workflow that would save you a further 10 minutes every day. Now, does choosing Gmail actually save you time overall? And is Gmail the better product if using Thunderbird now would lead to you being able to save 20 minutes a day a few years down the road?
This isn't about some sort of diversity for diversity's sake, this is about which of those options actually is in our very own long-term interest, and monopolies have a strong tendency to be very much not in the interest of the customer.
So, really, if anything, I would suggest that there is a moral obligation to watch out for people/organizations accumulating too much power and to prevent them from obtaining it if the long-term damage that that concentration of power can do is worse than the short-term benefits obtained from using their offerings.
> Well, but how do you evaluate what "the best provider" is?
Well, in the case of GH vs. GL: existence, for one (GH actually existed before GL and was usable from the start); not having to self-host anything, for another; and everyone using it, for a third.
> this is about which of those options actually is in our very own long-term interest
Based on your own example, if I have previously used TB for a year and then migrate to the new product that saves me twenty minutes, the net is ten minutes saved per day, plus a lot of nerves lost to the earlier slowness.
If I had used Gmail during that time, it would take the same amount of time again before the benefit zeroed out, and if I kept using it until TB became good and then migrated, I've only won.
As a user I already get a lot of bad UX; I do not want more voluntarily.
> if I kept using it until TB became good and then migrate I've only won.
You are assuming that you have that option. Once a monopoly is established, you don't have that option anymore. Also, while you support one solution, you decrease the chances of other solutions succeeding. If you pay licence fees to Microsoft instead of buying Redhat boxes, that has an impact on whether Redhat is better a year from now.
Decentralized doesn't mean you have to host everything yourself, there are many other companies to pay to host things outside your server. You don't want everything hanging centralized on your hardware anyway, if your server dies, everything is down.
That's true, except it is not common for a server to "die" every second day. It is not a problem to have uptimes in years. But due to updates it is reasonable to have a few minutes of downtime every few months. If you require 100% high availability (which only a handful of people and companies really need), then you can set up a cluster on Kubernetes/Docker.
Eh? It’s literally three shell commands to install GitLab. One to add their GPG key, one to add the upstream repo, and one to apt-get install the package. I bet there’s a curl | bash script to do all three as “one step”.
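For reference, the Omnibus install alluded to is roughly the following (per GitLab's install docs at the time; the URL and package name may have drifted, so check their current documentation before running anything):

```sh
# 1 + 2: add GitLab's GPG key and apt repo (their script does both)
curl -s https://packages.gitlab.com/install/repositories/gitlab/gitlab-ce/script.deb.sh | sudo bash
# 3: install the package itself
sudo apt-get install gitlab-ce
```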
If you’re going to complain about something, at least pick something legit like managing backups or server uptime. The install itself couldn’t be simpler.
100% agreed. Maintaining a self-hosted web service is a demanding task. It's not about updating to get some new features right when they're out; it's about security, and it's a real pain. The installation can get messed up, but so what? You're not using the tool yet. You can get a new VM. But when you're updating the machine that holds your code (heck, even GitLab.com had their issues with updates!), or the machine that holds your mail service, things get scary. Installation is not at all the issue.
The approach I'm taking to self-hosted services is that if it's down for an hour or even a couple of days, it's fine, since I'm the main user.
For disaster recovery, I put everything in ansible playbooks and roles. The role also takes care of setting up hourly backups (with Borg) and will import the latest backup when setting up the service. Because backups are separate, I can delete the last couple of backups if they've been compromised.
It's obviously not as resilient as using the centralized equivalent, but it feels like a decent trade off.
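A toy sketch of what such a role's tasks might contain (the `apt` and `cron` modules are real Ansible modules; the repo URL, paths, and schedule are made-up assumptions, not the linked role's actual contents):

```yaml
# tasks/main.yml (illustrative fragment)
- name: Install borg
  apt:
    name: borgbackup
    state: present

- name: Schedule hourly backups to an offsite repo
  cron:
    name: "borg hourly backup"
    minute: "0"
    job: "borg create ssh://backup@rsync.net/./repo::{now} /var/lib/myservice"
```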
> The approach I'm taking to self hosted services is that if it's down for an hour or even a couple of days it's fine since I'm the main user.
Well, maybe, depends on what you're depending on that service for. However, we were talking about large scale centralization v decentralization. If you took everything currently on GitHub and moved it to a bunch of different self-hosted servers, you would have a problem. Many _do_ depend on what's there on GH.
Even if it's not critical, it really sucks to want something and not be able to get it for 2 days because Jim went on vacation and his server crashed.
> For disaster recovery, I put everything in ansible playbooks and roles. The role also takes care of setting up hourly backups (with Borg) and will import the latest backup when setting up the service. Because backups are separate, I can delete the last couple of backups if they've been compromised.
> It's obviously not as resilient as using the centralized equivalent, but it feels like a decent trade off.
The issue isn't that it cannot be done, it's that centralized services mean _I_ don't have to worry about it. I don't want to be a server admin; I just want to store my code in source control.
I do the same but use Restic for incremental backups. The problem is that people's minds have been washed by corporate marketing. Unfortunately, they think they need multi-datacenter high availability, machine learning, and big-data DBs, when only a minority of people really need any of that. :(
Not very; some things I run on small $5 or $10 DigitalOcean droplets, other things on my home "server", which is a 4th-gen i5 with 8 GB of RAM. The Borg server is rsync.net.
And this is all assuming the person has the chance to do so. Younger people especially can't really self-host, or seriously lack the skills; programming alone is hard enough for them.
I got tired of managing all that software so I just moved to containers. The Docker Hub has pretty much all I need, and if it doesn't, it's much easier to get it working than spinning a new VM or trying to install it together with a bunch of other software and keep them all up to date.
Have you tried Gitea [0]? It's incredibly easy to set up, and if you run it on Debian you can set up Unattended Upgrades [1] once and then pretty much never have to worry about it again.
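For reference, once the unattended-upgrades package is installed on Debian, enabling it comes down to a two-line apt config (these values turn on daily package-list updates and automatic upgrades):

```
# /etc/apt/apt.conf.d/20auto-upgrades
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
```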
Ben Thompson at Stratechery has been banging the centralization drum for a while. His take is the sheer scale of the internet imbues tremendous value to any centralized service that can organize it (e.g. what Google does for web pages or Facebook does for your friends).
If this is the case then the internet may end up more centralized than previous forms of media and communication. I hope that's wrong, but it seems to be happening right in front of us.
He created it to be “stra-tee-ka-ry” (after George W. Bush’s infamous ‘strategery’) but no one ever pronounced it right. Over time, he admitted that “stra-tech-uh-ry” was better anyway because the site covers technology.
I think the big problem is not so much centralisation as it is that people are locked into platforms.
There's nothing wrong with using Gmail for email hosting if you have your own domain name, it's easy to then switch over. But if you use an @gmail.com email address and Google decides to ban you, you're screwed.
Same story with Facebook, there are quite a few people that I would possibly never be able to contact again if Facebook banned me or them. This isn't a new problem caused by Facebook though, in previous years there was the same problem with email, and before that, if someone changed their address or phone number you'd lose contact. There are actually people who I lost contact with who I regained contact through Facebook.
Anyway, my point is that centralisation isn't bad if it's painless to decentralise again. Git and Github are fine, at work we could switch to a self hosted solution or Bitbucket in a fairly trivial amount of time. Same for our work emails, we could switch from Gmail to another host with fairly minimal disruption. Centralisation with locked ecosystems, like Facebook, is not good.
Yeah, I wanted to say exactly this. There's a difference between specialisation in a market that still provides competition and consumer liquidity, and monopolistic centralisation (even when the monopolist is effectively benign).
It's like there's a middle-ground between everyone building their own automobile, and there being only a single car manufacturer on the planet. So long as you have a few players who are genuinely competing, and you have the option to switch between them without paying an excessive penalty, then it's a good thing that a certain subset of people are working hard on building better cars than you ever could tinkering away in your garage (although you should still be allowed to do that, if you so desire).
What this requires is for governments to provide a framework of standardisation, access-to-data, data-transferability, and antitrust enforcement that allows a healthy competitive environment to emerge and persist, without either allowing a monopoly (or a cartel) to evolve, or overburdening everyone with regulation.
It's a difficult balancing act, even in more traditional industries, and anyone who presents a too simplistic "more regulation!" or "less regulation!" answer to it isn't being honest. Tech presents additional challenges because it's newer, and because it's evolving rapidly. I think it will take many more decades for the necessary level of technical understanding to metastasise through policy-making and civil administration realms until you get effective control of tech markets, although laws like GDPR are an interesting start.
> But if you use an @gmail.com email address and Google decides to ban you, you're screwed.
You can lose control of your domain too. And if someone else snatches it up they could even read your newly incoming email. I'm willing to bet when Gmail bans an account, they don't recycle it.
> Anyway, my point is that centralisation isn't bad if it's painless to decentralise again.
I'd love to see GitHub (and GitLab) offer a custom domain name option which would proxy HTTPS and SSH clones via a company's custom domain (e.g. code.example.com) to GitHub's hosting. That way, if a company needs to move away from GitHub (or GitHub goes away), developers' references to the company's code are not stuck pointing to a stale github.com domain.
Hosting such a proxy (and the repos) takes infrastructure, time, and experience to secure and make available, so from a user's perspective it's great if a company does this as a service as long as you provide the domain.
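As a rough sketch of the HTTPS half of this idea (SSH clones would need something like a TCP-level proxy instead), an nginx fragment along these lines could forward clone traffic. The `example` org name and cert paths are made-up assumptions, and in practice GitHub's redirects and authentication make this trickier than it looks:

```
# nginx fragment (illustrative): proxy code.example.com through to GitHub
server {
    listen 443 ssl;
    server_name code.example.com;
    ssl_certificate     /etc/ssl/code.example.com.pem;
    ssl_certificate_key /etc/ssl/code.example.com.key;

    location / {
        proxy_pass https://github.com/example/;  # rewrite to the org's namespace
        proxy_set_header Host github.com;        # upstream expects its own hostname
        proxy_ssl_server_name on;                # send SNI on the upstream TLS handshake
    }
}
```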
Maybe we can eventually achieve something like decentralized ownership of centralized web resources.
While not quite decentralized ownership, Wikipedia is at least a non-profit that everyone contributes donations to. Where's the Wikipedia for code? Where's the non-profit donation-based social network?
> Everything, really, seems to naturally trend towards centralization,
It's economics 101, really: division of labour and economies of scale. A company like Github can build and sustain capabilities that are flatly uneconomical for their customers to build themselves.
In terms of strategy, Github's lockin doesn't come from git. It comes from everything around it: issues, pull requests etc. This functionality can be replicated, but data has inertia. The more you have, the less you want to move it.
For an instructive parallel, consider how easy it is to use AWS Lambda with all of AWS's data-centric services.
If centralisation only occurs due to government enforcement, then how did government itself come into existence? It's the purest example of centralisation, after all. There must be something that incentivised centralisation even in the absence of government.
> The whole darn internet was built to be decentralized. And yet we all use GMail, the same handful of DNS servers, the same short list of major trunk hubs, the same shrinking list of bitcoin mining pools, etc.
Eh, I have Gmail, but I barely use it.
Regarding DNS, that's only because the DNS provided by ISPs is being tampered with or is downright awful.
What happens is people copy each other's behavior. X (friend of Y) starts using Facebook. A friend of X starts using Facebook as well. But X never looked into the alternatives (who has the time for that, plus everyone is using Facebook); it's only because of other people that they started using it. It is called the network effect [1], though we can thank the early adopters for feeding that hype. I see it as a sign of capitalism/optimization.
You can easily migrate your repo to a different service, so having lots of developers rely on GitHub isn't really a big deal. Many large projects have GitHub mirrors.
I'd say the biggest issue is that Git doesn't include better built-in support for issues, wikis, PRs / code reviews, and releases. Compare that to Fossil [0], which lets you bundle up everything into a single file. If there was better built-in support you could migrate everything more easily to self-hosted alternatives like Gitea [1]. Regardless, it's possible to migrate manually, even if it's a bit more work.
> Git doesn't include better built-in support for issues, wikis, PRs / code reviews, and releases
I feel like a lot of these features could be handled with git-notes[1]. For example git-appraise[2] uses the git-notes feature for a code review system:
Somewhere around 2009-2012 there was also a lot of experimenting going on with distributed issue tracking, such as with Bugs Everywhere [0] (and like 3 or 4 others I tried some time ago that I can't find now).
That said, it seems like there was a resurgence - I found two, git-issue [1] and git-dit [2], that have both had activity this year. And git-dit has shown up on here at least once before, as well [3].
Part of the challenge is that for issue tracking to really work for non-trivial projects, there needs to be some web interface that allows newcomers to report issues without first obtaining commit access to the main repository. If some of the solutions can add this bit of functionality then they stand a chance of doing some really awesome things (I'd love to be able to work on and update issues while totally offline).
Right now they do still unfortunately come off as toy examples once you get to the point where their distributed nature would normally come in handy. It's a shame that the interest in the problem died down a few years back around the time you noted.
Thanks for sharing, I'd never heard of git-notes before! That's really cool. I'd previously wondered why there wasn't something like this available; it looks like I just wasn't aware of its existence. I definitely agree that you could implement many powerful features using this.
Now I'm wondering if someone has implemented a release tracker using git-notes. After you tag the release you could build / generate the output assets (e.g. binaries), upload them to some remotes (e.g. S3, GitHub, your server), and add a note with the release info plus some links.
I've already thought up a couple ideas for things that could be built on top of git-notes.
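As a tiny proof of concept of that release-tracker idea, plain git commands can attach release metadata to a tagged commit under a dedicated notes ref. The paths, tag name, and artifact URL below are all made up for illustration:

```shell
set -e
rm -rf /tmp/notes-demo && mkdir -p /tmp/notes-demo && cd /tmp/notes-demo
git init -q .
echo "v1" > app.txt
git add app.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m "release 1.0"
git tag v1.0

# store release metadata in refs/notes/releases, attached to the tagged commit
git -c user.name=demo -c user.email=demo@example.com \
    notes --ref=releases add -m "artifact: https://example.com/app-1.0.tar.gz" v1.0

# anyone who fetches refs/notes/releases can read it back
git notes --ref=releases show v1.0
```

Because notes live on a ref of their own, they can be pushed and fetched like any branch (`git push origin refs/notes/releases`), which is what makes them usable as a distributed metadata store.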
I'm glad to see Fossil mentioned here. Yet the real problem is not in the technical aspects and possibilities of project hosting migration, it's the dependence of the majority of open-source developers on GitHub's centralized social network.
Hopefully if this turns GitHub evil, other social networks can be leveraged to assist in a population migration. Like, someone posting a link on Twitter to wherever the repo will exist in the future.
This makes me wonder how hard it is to securely host a git server on my own. I'll start googling but I bet the answer is gonna be "hard."
The only hard part is securing your server, since setting up a remote git server only requires two things:
1. Running git.
2. Providing SSH access.
But even securing the server is not necessarily difficult, depending on the level of risk acceptable for your purposes.
Put another way, reasonable security is not too hard and AWS makes reasonable security fairly easy. Even self-hosted *nix servers can be pretty secure. However to seriously harden a service against skilled and determined attackers requires an experienced team and even that may not be enough.
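To make the two requirements concrete: a working "server" is just a bare repository reachable over SSH. The sketch below substitutes a local path for something like ssh://user@host/srv/git/project.git, but the mechanics are identical (all paths here are illustrative):

```shell
set -e
rm -rf /tmp/git-demo && mkdir -p /tmp/git-demo

# on the "server": create a bare repository (no working tree)
git init -q --bare /tmp/git-demo/project.git

# on the "client": clone, commit, push as usual
git clone -q /tmp/git-demo/project.git /tmp/git-demo/checkout
cd /tmp/git-demo/checkout
echo "hello" > README
git add README
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial commit"
git push -q origin HEAD
```

With a real host you'd replace the path with an SSH URL and let sshd handle authentication, which is the sense in which every git user already has a self-hosting stack installed.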
Point being, the data can and should live within the repo, like gh-pages has done for a while. It's not about those tools being integrated into official git tooling.
Pagure (https://pagure.io/pagure) combines a simple Git hosting forge with SIT-style issue tracking. On top of the code itself in Git, Pagure also uses Git for managing issues. So you can even use Git to comment on things. :)
Also, for those interested, Pagure lets you live a JS-free life too. ;)
I've been using it for several months on pagure.io, and it's my preferred Git-based hosting for my private stuff locally.
At my level of understanding, the main difference between Pagure and SIT is that SIT's story is more about decentralization (and SIT also provides its core to other potential applications beyond the software development niche).
Well, centralisation also has a huge number of benefits, so it's not really a surprise.
The more important thing is to be in a position that makes it harder to be locked-in to a centralised provider. Fortunately Git makes that relatively easy – I could switch all of my work to Gitlab or Bitbucket with relatively little work.
There's a more obvious problem where Github is being used for issue tracking, PRs, and the general open-source community. I'm sure there will be a few scripts available to make migrating issues etc. to another provider relatively painless, since there's no great distributed solution for this at the moment. That just leaves the community aspect, which is going to be the hard bit…
I think the network effect is too great to ignore. I would guess the number of potential contributors you get just by using GitHub, where many people have an account and know the workflow/UI, is bigger than any other place.
5 years from now when GitLab is acquired by Google we'll have to migrate again.
Or export you projects to someone else running GitLab. All the functionality to run a forge in GitLab is open source and export/import is open source as well.
Which Gitlab? The software is less important than the ecosystem. Git is the major component, but Github/Gitlab are about usefully centralizing it. Gitlab is still a centralized service, even if there are N instances of centralization.
In other words, I don't know how much use it is to fork Gitlab if the community around it is dispersed. Git is already based around decentralization, and I don't see how running my own instance of Gitlab makes it any less disruptive when the most popular instance of Gitlab is disbanded.
But how do you migrate your issues, CI integration, and deployment? The Google Code to GitHub migration was already a disaster.
Code is easy, it's the rest which has the value.
Issue importing rarely works. It needs a lot of related changes all over, which cannot be automated.
I edit my CI scripts almost daily, until the workflow is acceptable.
I try all the others constantly, but it's a headache and needs weeks of time until it works well enough. Github only hosts the results and releases, but still.
Just last week I spent again a full day to get the encrypted auth working for github deployment from some random new CI.
My CIs work together with connected hooks, e.g. if appveyor and travis pass, merge to master, then appveyor deploys the binaries to github, then the release is published, which sets a tag, which triggers travis to make distcheck and deploy to the very same release on github.
On github I have static pages with custom domain names for free.
On github I can host releases and dailies for free.
Someone needs to pay for it. Better a big company which backs this.
I'm a bit sceptical of Microsoft, as I'm sceptical of Google. You can never trust PMs in big companies. But so far I give them the benefit of the doubt; they clearly love GitHub (unlike Google, which didn't love code.google), and we'll all see.
I think we all trusted GitHub because of git: we knew that the worst case was GitHub disappearing overnight, erasing all repositories, and in that case we would all still have distributed backups to restart from.
So we had limited trust in the centralized missions we gave it: communications, issues, pull requests.
I am sure that as soon as a good tool lets us do that in a decentralized fashion, we will switch to it.
If Git/Github is the epitome of the developer's centralised experience, then I wholeheartedly welcome that.
As others have mentioned, there are certainly advantages to centralised systems like Github's network graph, issues, etc. But at the end of the day, everyone on GitHub still cares a whole lot more about the code, and due to the very nature of the protocol that's something that will always be able to be taken elsewhere. It's not a case of "if they announce a shutdown you can export your data"; there's a very high likelihood you already have, and you have the full repo on your local machine.
I feel like Github is the best possible compromise - centralised network and features that's built on a decentralised protocol and easily able to be taken elsewhere.
Basically, Cydia is an apt manager for jailbroken iOS devices, and while Cydia itself is open source, the tools required for tweak injection (Mobile Substrate) are not, making Cydia itself useless for newer jailbreaks. That makes Cydia not only a monopoly, but also centralizes the tools required to make it useful at all.
>Maybe this event will finally shift things back into a decentralized direction.
Absolutely no chance, and I'm fine with that. Sorry, but I don't feel like spinning up a Git server and managing all that goes along with it. I just want my code hosted somewhere that others can access. These meta issues are just noise and don't bother me (or 95%+ of the people using GH) one bit.
And there it is. "Git" "hub" could be considered an oxymoron. The adoption of this proprietary closed-source veneer over git by FOSS developers has always been ironic to the point of farce. Now the other shoe has dropped.
Good point. It's not too crazy to have the people behind major open-source efforts get their own VPSes and run Phabricator or GitLab. With a universal login akin to OAuth, maybe it's doable.
That's insane: not only do developers and maintainers devote a huge part of their lives to major open-source efforts, now they are expected to spend more time and money setting up, administering, and securing their own private servers? That's a great way to destroy open-source projects. Very few people would be able or willing to put that much effort into it.
Not sure if this is an age-gap or participation-gap here but that's EXACTLY what we did before Github came along. My foray into open source started in '99 and it wasn't until 2010? that we had good centralized alternatives to switch to. Hell, jQuery was on a hosted version of Trac until just recently. And yes, we spent time on it. A great deal more than "very few" did the same and were involved and passionate about it.
Not all of us. I have been writing open source since approx. 1998, and I remember hosting my code on SourceForge and announcing on Freshmeat. There were also a few other central repositories, and some major projects had their own CVS and later Subversion repositories. Later came Google Code, which I used too, circa 2004 or 2005. But I clearly remember people, myself included, looking for a code repository they could use for their source code.
I could envision pre-configured vps images maintained in the open by volunteers to make it more practical. They could have unattended-upgrades configured for major patching and the community could help migrate between major LTS releases.
It isn't ideal but it might be doable at least for some projects. Maybe devops people want new ways to contribute to open source too!
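For a sense of how small the unattended-upgrades piece of such an image would be: on Debian/Ubuntu (an assumption here) it largely comes down to shipping one apt configuration file, conventionally this one:

```
# /etc/apt/apt.conf.d/20auto-upgrades -- enables apt's periodic job
# so package lists are refreshed and security updates applied automatically.
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
```

The unattended-upgrades package itself then decides which origins (e.g. security) are eligible, so the volunteer-maintained image could pin that policy too.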
How is running a bit of infrastructure so lethal? Most of my libre projects aren't on Github, and running your own Git server isn't terribly difficult.
Running your own git server is far from impossible. But you have to take into account various items.
1) The cost of the server. You don't need a powerful one; a $5 or $10 a month box is more than enough, but it constitutes a barrier for some, not necessarily because they can't afford it, but because they don't think their project is worth it.
2) You have to maintain it properly: applying security fixes, keeping backups, and maybe some basic monitoring. It's not high or complex maintenance, but it's far from a fire-and-forget thing.
3) Contribution from external sources becomes harder. You generally don't have basic "fork" and PR buttons with the simpler git hosting options, and if you opt for something like GitLab or Gogs, external contributors have to register on your instance.
4) GitHub is really convenient if you are searching for open-source tools that do a specific task in a specific language. Having various self-hosted git servers makes it harder to discover new tools.
I do run my own Gogs instance which mirrors my github repositories, it's not heavy maintenance, but it's not exactly nothing. And I'm pretty sure nobody, except bots, knows it exists.
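For context on how small the bare-bones case is: a self-hosted Git remote is just a bare repository, sketched here with local paths (all paths are illustrative; over the network you'd put the bare repo behind SSH instead):

```shell
set -e
rm -rf /tmp/demo-server /tmp/demo-client

# "Server" side: a bare repository is all a Git remote needs.
git init -q --bare /tmp/demo-server/myproject.git

# "Client" side: clone, commit, push -- same flow as with a hosted remote.
# (Over a network the URL would look like ssh://git@host/srv/git/myproject.git.)
git clone -q /tmp/demo-server/myproject.git /tmp/demo-client
cd /tmp/demo-client
echo "hello" > README
git add README
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial commit"
git push -q origin HEAD
```

That said, this sketch covers only item 2's baseline; the fork/PR buttons and discoverability from items 3 and 4 are exactly what it doesn't give you.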
Decentralization is a wonderful idea. It brings with it benefits that are technical, architectural, and social. Yet it also carries with it its own set of costs and risks.
Bluntly, I doubt this will shock the world of software development into being less centralized. The practical benefits of centralization are, apparently, too large to be readily overcome.
(shameless plug) I recently built a tool (SIT, https://sit.fyi) that decentralizes issue tracking and similar workflows, giving full data ownership and independence of storage and transport. It can be (and is actively being) used with Git, or other SCMs.
There's a known issue with older Firefox. Once the site is updated to the new version, this issue should be gone. Let me know if the problem persists for you on any other browsers! Thank you!
It doesn't rely on servers (it's striving to be a "true serverless" application) so server patches are perhaps taken care of with this :)
Split-brain is the "normal mode" of operations for SIT -- it's a decentralized tooling so it expects something to not be available to you until it has been received by you -- at that point it is sitting in its entirety on your computer and you don't need an internet connection.
Abusive users -- depends on the scenario and "gatekeeping" checks -- known/vouched-for GPG keys can be used to enforce some policies. Admittedly, this being a new tool, this angle hasn't been worked on extensively (if at all) -- but the way SIT is architected should potentially allow for all kinds of scenarios thanks to its flexibility and simplicity.
I could thank you for your time, for your effort, for humoring me. Etc. etc. etc.
I'll cut to the chase.
Your answer is basically "You have to do extra work". Which is the wrong answer when trying to replace tools whose major selling point is reducing the amount of maintenance work I have to do. This response, unfortunately, commits the basic error of trying to solve a social problem with a technological solution. I don't want to take on the work of guarding the gates. I don't want to guard against potential abuse. I want these things done for me, so I can focus on the parts I care about.
The ability to put in work to monitor abuse and access control in a decentralized manner is not a significant improvement over the ability to do so in a centralized manner.
Oh, and if it interacts with other computers at all then I have to worry about server maintenance. A machine is only truly serverless in isolation. Which would make your tool interesting, but potentially less than maximally useful.
This. As with so many things, laziness has driven developers to use GitHub as a single point of truth. And it is cringeworthy to see how few developers actually know what git is and that GitHub is just a company monetizing open-source software. About time everyone got a wake-up call.
To be fair, github provides a lot of value-add on top of plain git. Having UI for access control and integration with issue tracking and CI tools gives you a lot that you would not get out of the box with git.
That said, I have always been uneasy with one central player hosting a huge percentage of open source projects.
> In spite of this, we all have congregated around GitHub
Nearly all. I never got into the swing of it - always expecting a thing like this to happen sooner or later. Honestly, I shall not be sad if the world starts beating a path to somewhere else now.
Humans* aren't built for decentralization. We can only keep so many unique names in our heads at a time, only so many connections. I don't have any papers to link to on this, but I think it's safe to say that's how things are; if not, we wouldn't have created cities, national identities, and of course newsgroups, message boards, centralized package managers and distributions, and yes, source code repositories.
There is something nice about knowing, with a high degree of certainty, that any given open source project (or any thing) is in one of a few places, and if I don't know where that is I can ask for help to find those places by their names.
That's mixing up a whole lot of concepts. Centralization is about power.
Cities generally have very little centralization. Almost no interactions in a city go through a central entity, similarly for countries. Concentration is something different than centralization.
Newsgroups have very little centralization. Everyone reads and writes through different servers, and you can trivially switch servers without any impact on who you can communicate with. Agreeing on names and federation is something different than centralization.
...
There is absolutely no need to have software development happen on one proprietary platform in order to be able to search for software project in one centralized location. What you need for that is a search engine. Or even multiple competing search engines that can all index the same set of software projects.
Just because cities, countries, and newsgroups allow lots of different entities to participate doesn't mean they aren't centralizing. The strong incentives to place like with like result in forums with topics and subforums with subtopics. Districts become homogeneous and we get things like Chinatown or the Castro District.
And these agglomerations of like with like occur spontaneously at pretty much every scale. You're quite literally missing the forest for the trees.
Yes, like likes to be around like, and systems that allow people to find people who are like them or who meet some of their needs are really useful, and you can call that phenomenon of "bringing people who want to interact together" "centralization". But no one ever argued against that.
When people talk about "centralization" in this context, they speak about power structures. A centralized power structure is when power is concentrated in the hands of a few. The fact that there are other forms of "centralization" is completely beside the point. If anything, the point is that you can often have those other forms of (useful) centralization without the centralization of power. Newsgroups, for example, obviously centralize discussion, in that they bring people who share an interest together in one (virtual) place. But they don't centralize power, because the technical implementation makes it so that no one party can easily control the communication on all of Usenet.
No one argues that we shouldn't have systems that allow us to interface with the world in a central location. This is all about avoiding concentration of power.
Your original argument is like saying that we should put all websites on the platform of a new company, say WebHub, because you want to have one place to search for websites. It's ultimately a non-sequitur: Millions of companies independently host websites on this planet, but obviously you don't visit millions of websites and ask the search function on each one of them your question of the day. You simply use one of a number of search engines, whichever one you prefer for your own needs, and no matter which search engine you choose to use, you will be searching pretty much the exact same set of websites. There is simply no need to centralize the power to control what can be published on the web, or how the web works, with one hosting company in order to have one central location that you can use to navigate it all.
You don't need centralization of power in order to have centralization of access.
> You don't need centralization of power in order to have centralization of access.
It isn't necessary, but it is sufficient and that seems to be enough for a lot of folks. You seem to be arguing that I'm saying this is ideal. It's not, but it does seem to be the structure most people arrive at naturally.
> It isn't necessary, but it is sufficient and that seems to be enough for a lot of folks. You seem to be arguing that I'm saying this is ideal. It's not,
In other words: It could be even worse? Your point being?
Monarchy isn't necessary to govern a country, but it is sufficient and that seems to be enough for a lot of folks.
Does that make monarchy a good idea? Does that make democracy a bad idea? Does that mean that people have considered the advantages of democracy? Does that mean people are even aware of democracy? Does that mean anything?
> but it does seem to be the structure most people arrive at naturally.
Centralization of power? Well, if you mean by that that people who want to have power often are good at finding ways to dupe people into various forms of dependency, then I guess so? Again, your point being?
Many humans are apparently not, but I had a much easier time navigating (and contributing to!) projects when each of them had their own website with integrated scm links etc.
GitHub is a grey mass, project branding is lost, the tracker is chaotic for large projects.
It's just Sourceforge, better executed but with the same disadvantages.
I think that's quite a stretch, the reasons for decentralized version control (practical use) are far different than a decentralized internet (humanitarianism).
It’s not just about efficiency, it’s about responsibility. I love decentralization when I can avoid dealing with BigCo, I hate it when I’m responsible for fighting spam, ensuring my services stay up, not getting hacked, and so on. All of these tasks are a requirement as soon as collaboration is in the picture.
I disagree. The popularity of GitHub amongst the programming community means that ease of use / convenience is more important than the notion of “decentralization”. Otherwise, people would already be setting up their servers for hosting git repos.
That's really what I meant by efficiency. Ease of use increases that for businesses, that's the only reason they really care about ease of use of products they want to buy.
Totally agree. For me, decentralized means desktop clients or my own cloud. Own cloud is more expensive to maintain. Desktop apps are more expensive to develop and maintain too.
If you want to have a main canonical state of the "maintained" repo, that is mutable, similar to what we currently have with Github, a blockchain would be one way to achieve that.
Then you are still missing a way to discover what the "one canonical state" of a repository is. There could be hundreds of different signed branches and versions of the repository the maintainer worked on at one point, with most of them being abandoned work.
Let's ignore for a moment the utility of Github issues, etc. and focus on why Github repos are convenient just for the code.
- You find the repo either via link or some other discovery mechanism like Google
- You do a small sanity check to see if the repo is the one you actually wanted
- You clone the repo and start using it from the default branch that was set by the maintainer -> DONE
Achieving a similar flow is not possible if you just fling signed commits out into the world. Sure, there are also different ways you could publish the "one canonical state", but as soon as multiple maintainers and contributors get involved, a blockchain will look more and more like a fitting solution.
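Worth noting: plain git already covers a small piece of this discovery story. The remote's HEAD symref is how clients learn which branch the maintainer designated as default, no blockchain required. A sketch with local, illustrative paths:

```shell
set -e
rm -rf /tmp/canon-work /tmp/canon.git

# A maintainer's working repo, with the branch they consider canonical.
git init -q /tmp/canon-work
cd /tmp/canon-work
echo "x" > file
git add file
git -c user.name=demo -c user.email=demo@example.com commit -q -m "init"
git branch -M stable

# Publish to a bare "server" repo and mark 'stable' as its default branch.
git init -q --bare /tmp/canon.git
git push -q /tmp/canon.git stable
git --git-dir=/tmp/canon.git symbolic-ref HEAD refs/heads/stable

# Any client can now discover the canonical branch without even cloning:
git ls-remote --symref /tmp/canon.git HEAD
```

What git alone does not solve is the trust question of which *remote* is canonical, which is where the hundreds-of-signed-branches problem above comes back in.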