git at revision 0 worked the same way. You can see that there are no references in git at that time either. They're both copying bitkeeper, which worked the same way.
Nowadays git has references (branches), and hg has bookmarks which are the same, plus hg also has the option to label every commit with a permanent branch name. They also still have branching-by-cloning, and if you listen to Linus's original Google Tech Talk about Git, you can see that he conflates "branch" and "clone" because that's what he originally envisioned! Even in 2007 he was still thinking in bitkeeper terms too. I bet that branching with references was Junio Hamano's idea, after Linus did the code hand-off.
I find branching-by-cloning a bit more natural in hg, because you can push to any repo. It's useful for quick, throwaway, local, easy testing out of ideas. In git, you can only push if your push doesn't modify HEAD, which typically translates into only being able to push to bare repos.
Interesting, thanks for the info. I've only been using Git since 2009 or so. I love Git's model of commits being objects in their own right, allowing you to cherry-pick them across branches, or rebase them to reorder or squash several commits together, for example.
My usual development routine is to make a ton of small commits that add up to a small set of good commits, to promote bisect-ability. I do dozens of rebases, squashes and amends when working on a topic branch. I have to use Mercurial for one of my clients, and it's a nightmare doing my development model in an SCM where I can't toss commits around willy-nilly like I can in Git.
> I have to use Mercurial for one of my clients, and it's a nightmare doing my development model in an SCM where I can't toss commits around willy-nilly like I can in Git.
Yes you can. `hg histedit` is a lot like `git rebase -i`, `hg rebase` is like `git rebase` without `-i`, and `hg commit --amend` is a lot like `git commit --amend`.
There are also some really cool things that we're working on with hg:
There is The Architecture of Open Source Applications series of books (http://aosabook.org/en/index.html), where one of the authors of each piece of software explains the essence of the program.
"A marathon of clicking 'next page,' but the view is worth it." So, this commenter practically worships git, but apparently doesn't actually understand it well enough to know a better way to find the hash of the first commit and punch that into Github. Or, it was just a joke and they got there the quick way, but still felt obliged to post a dumb joke to inflate their own ego by "leaving their mark" on git. Maybe I'm being too mean, but yeah, I also think a lot of the comments are pointless.
> Maybe I'm being too mean, but yeah, I also think a lot of the comments are pointless.
Yeah, I think you're being a little mean. If you browse to that user's GitHub page, it looks like it's just somebody new who's excited about software. Good for them.
The comments are pointless, sure, but also harmless. Similar comments might crowd out productive discussion if they were on (say) the head of the master branch, but I doubt that any serious development is happening on git's initial commit anyway. Let the new people have their fun.
As far as newbie disruptiveness goes, it could be far worse. When I was getting started with Linux, I posted this cringeworthy gem to LKML, now enshrined in the archives for all eternity: https://lkml.org/lkml/2000/10/22/69 If newbies today are merely posting "yay, git!" and "thank you!" to a secondary forum where it doesn't disrupt development, I'd say they're doing pretty well in comparison. :)
Yeah, fair enough. Good on you for linking your own cringey post. I think a lot of developers have those early cringe moments, especially if they were young when they started.
As far as disruption, it did occur to me later that somebody may be getting notification emails about these comments. But it's not too bad, as I assume they could just send the emails to /dev/null, since Github is not the official host of git. (As a tangential note, I sort of wish Github would handle this better. So many Github-mirrored projects end up with something like "don't submit pull requests or open issues here, they will be ignored" in their repo description.)
It's lots of subreddits. There are some serious ones, but the mainstream ones all contain the usual memes, in-jokes, etc.
I enjoy diving into reddit every now and again. But I use github for work (and code for fun, although it's 'serious' fun). Although open-source collaboration is a fundamentally social activity, I think that mixing source control with a social network inevitably leads to these kinds of comments. And I wouldn't dream of mixing that up with my professional identity.
Maybe it's just a marker of how versatile github is, and the community of people who write programs and put them in source control.
Version control systems are self-hosting when they are used to manage the primary repository of their own source code. This shows confidence because if the program breaks, then it breaks its own configuration management, which could be a headache to unravel. For example, if the repository format changes, then the change has to be managed so that the old versions remain accessible through the new version of the software. If this is not managed, and old compiled binaries of the version control system disappear from existence, then it may become impossible to recover the old sources.
Thus, successfully self-hosting a version control system is some measure of evidence that the developers know what they are doing and can manage the changes. (And thus they understand change management and we can trust them to be working on version control software.)
'Hosting' means 'contain', 'serve'. A building can host a department or a convention, and a married couple can host a dinner party, with neither being required to be a webserver or programming language.
Maybe I've been drilled too hard by a couple of programming gurus, but I immediately noticed there are quite a lot of repeated yet unnamed magic constants in the (otherwise pretty clean) code. According to Wikipedia [1], the rule against using them is even one of the oldest in programming. Curious what kind of profanity Linus would come up with when confronted with this :]
This. I find that learning from original documentation tends to be much more efficient than learning from third party blogs/tutorials which try to "simplify" things, and usually do the opposite.
Does anyone know if the structure of git has changed much? I would like to read this on the assumption that it is pretty close to the current implementation, but I have no idea whether that's true. Anyone?
It seems to be mostly the same, except that "Changeset" is now called "Commit" and "Current directory cache" is now called "index", but they are functionally the same.
It's actually really great to see that the model hasn't changed much (there must have been a long phase of thinking before though)
If you want to go deeper, you can check out this page:
One noteworthy difference is that in the original repository format, a tree object was just a list of named blobs. Nowadays each subdirectory of a tree is its own nested tree object, which means that when you're comparing two trees, you can skip over the directories that are identical.
I'm not sure when that change was made but it must have been very early on, because the repository format has been basically stable for many years now.
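To make the nested-tree point above concrete, here is a rough Python sketch of why hashing each subdirectory as its own tree object lets a comparison skip identical directories. This is not git's real object format or code: the in-memory `store`, the helpers `put_blob` / `put_tree` / `diff_trees`, and the `visited` list are all made up for illustration.

```python
import hashlib

# Hypothetical in-memory object store (not git's on-disk format):
# hash -> ("blob", bytes) or ("tree", {name: hash}).
store = {}
visited = []  # directory prefixes that diff_trees actually descended into


def put_blob(data):
    key = hashlib.sha1(b"blob\0" + data).hexdigest()
    store[key] = ("blob", data)
    return key


def put_tree(entries):
    # entries maps a name to the hash of a blob or of a nested tree.
    payload = "".join(f"{name}\0{h}\n" for name, h in sorted(entries.items()))
    key = hashlib.sha1(b"tree\0" + payload.encode()).hexdigest()
    store[key] = ("tree", dict(entries))
    return key


def is_tree(h):
    return h is not None and store[h][0] == "tree"


def entries_of(h):
    # Entries of a tree; empty for a missing entry or a blob.
    return store[h][1] if is_tree(h) else {}


def diff_trees(a, b, prefix=""):
    """Return the file paths that differ between tree hashes a and b."""
    visited.append(prefix)
    if a == b:
        return []
    changed = []
    ea, eb = entries_of(a), entries_of(b)
    for name in sorted(set(ea) | set(eb)):
        ha, hb = ea.get(name), eb.get(name)
        if ha == hb:
            continue  # same hash: identical blob or identical whole subtree
        path = prefix + name
        ta, tb = is_tree(ha), is_tree(hb)
        if ta or tb:
            # Only descend into subtrees whose hashes differ.
            changed += diff_trees(ha if ta else None, hb if tb else None, path + "/")
        if (ha is not None and not ta) or (hb is not None and not tb):
            changed.append(path)
    return changed


docs = put_tree({"README": put_blob(b"hello")})
src_v1 = put_tree({"main.c": put_blob(b"int main(){}"),
                   "util.c": put_blob(b"/* util */")})
src_v2 = put_tree({"main.c": put_blob(b"int main(){return 0;}"),
                   "util.c": put_blob(b"/* util */")})
root_v1 = put_tree({"docs": docs, "src": src_v1})
root_v2 = put_tree({"docs": docs, "src": src_v2})

print(diff_trees(root_v1, root_v2))  # ['src/main.c']
print(visited)                       # ['', 'src/']
```

The `docs` directory never shows up in `visited` because its hash is the same on both sides. With a flat list of named blobs, every entry would have to be compared individually.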
http://selenic.com/hg/rev/0#l10.1
The revlog data structure from then is still around, slightly tweaked, but essentially unchanged in almost a decade.