Yes, any operation is easy at widths of 2^(2^n) bits. For instance, take addition of two 128-bit numbers x and y (seen as 32-bit int arrays) on a 64-bit big-endian CPU:
So you'd read 32-bit values into 64-bit registers, set the top 32 bits to zero, perform the addition, and then write out a 32-bit value again.
It gets much worse if your width doesn't line up with the 2^(2^n) sizes the CPU supports natively; if you were to use 100 bits, you'd have to AND the values with a bitmask and write out single bytes.
So 128 bits is far easier to implement, faster on many CPU architectures, and you get the peace of mind that your code will keep working for a long time. For instance, assume the lower bound of 9 months per doubling (which is unrealistic, as described in this article); then you're going to hit:
Now, what's the expected lifetime of a long-term storage system? It's well-known that the US nuclear force uses 8-inch floppy disks. Those were designed around 1970, so a lifetime of roughly 50 years is to be expected. For ZFS, that would be 2054. By this (admittedly very conservative) calculation, 128 bits is only barely more than required.
The purpose of the mark() function is to make it easier to spot the code for the additions in the compiler's assembly output. Here is what "cc -S -O3" (whatever cc ships with macOS High Sierra) produces on my 64-bit Intel Core i5 for the parts that actually do the math:
I'm not too familiar with x86-64 assembly, but I assume this could be made to handle carry by changing the "addl" to "adcl", the 32-bit add-with-carry instruction.
Taking out the (uint32_t *) casts to turn the C code from 96-bit adding into 128-bit adding generates assembly that differs only in that both movl instructions become movq instructions, and addl becomes addq.
So, if you were writing in C, it looks like a 96-bit add would be a little uglier than a 128-bit add because of the casts, but isn't slower or bigger under the hood. Note, though, that this assumes accessing the 96-bit number as an array of variable-sized parts. It's that assumption that introduces the need for ugly casts.
If a struct is used, then there is no need for casts:
This generates the same code as the earlier version.
(I still have no idea how to handle the carry in C, or at least no idea that is not ridiculously inefficient. When I've implemented big-integer libraries, I've either used a type for my "digits" that is smaller than the native integer size, so that I could detect a carry with a simple AND, or I've handled the low-level addition in assembly.)
1. Accesses through pointers type-punned to something other than `char`, `signed char`, or `unsigned char` are undefined behavior.
uint64_t n = 0xdeadbeef;
uint32_t foo = (uint32_t)n; // OK
uint32_t *bar = (uint32_t*)&n; // "OK" but useless
foo = *bar; // undefined behavior!!!
uint8_t *baz = (uint8_t*)&n;
uint8_t byte = *baz; // OK, uint8_t is (in practice) `unsigned char`
// Accessing via the corresponding signed type (with added qualifiers) is OK
const volatile long long *p = (const volatile long long *)&n;
const volatile long long cvll = *p; // well-defined (if uint64_t is unsigned long long)
2. Structs are aligned to the member with the strictest alignment requirement, so a struct of a `uint64_t` and a `uint32_t` will be aligned on an 8-byte boundary, meaning its size will be 128 bits.
> Structs are aligned to the member with the strictest alignment requirement, so a struct of a `uint64_t` and a `uint32_t` will be aligned on an 8-byte boundary, meaning its size will be 128 bits.
Don't most C compilers support a pragma to control this? "#pragma pack(4)" for clang and gcc, I believe.
Given this (where I've made it add two arrays of 96-bit integers to make it easier to figure out the sizes in the assembly):
#include <stdint.h>

#pragma pack(4)
struct block_addr {
    uint64_t low;
    uint32_t high;
};

int sum(struct block_addr *a, struct block_addr *b, struct block_addr *c)
{
    for (int i = 0; i < 8; ++i)
    {
        c->low = a->low + b->low;
        c++->high = a++->high + b++->high;
    }
    return 17;
}
here is the code for the loop body, which the compiler unrolled to make it even easier to see how the structure is laid out:
Packed structs are possible, to be sure, but inhibit numerous optimizations, such as (relevant to this case) the use of vector instructions and vector registers.
Changing the loop to 4 iterations for compactness' sake, (aligned) structs of two u64s generate the following, vectorized code:
Oops, I used `#pragma pack` incorrectly in my code, but it doesn't change the codegen for the 96-bit structs other than offsets. Also `restrict` is only needed on the output argument to enable full vectorization of the 128-bit structs.
I believe this program properly handles carry from the low to high part.
The 96- and 128-bit code have the same number of instructions, but the 128-bit code has more instruction bytes due to REX prefixes (i.e., a 32-bit register add is 3 bytes of opcode, a 64-bit register add is 4).
On the other hand, they could have used 96 bits of block pointer and 32 bits of metadata in a sort of tagged reference or capability system, instead of shuffling around a bunch of high-order zero bytes forever.
Assuming the tagged reference or capability system was built, wouldn't it need software to take advantage of it? If it's not actively used, no real point having it over more block pointer space - and I doubt significant amounts of software would use such a filesystem-specific feature.
Most CPUs do not support 128-bit integer math directly. They would do 64-bit integer ops with carry. On most architectures that would be no different in code size from a 64-bit op followed by a 32-bit op.
Very complex compilers and/or CISC decoders on superscalar processors could theoretically rewrite some 128-bit ops into 32-bit ops and run the computations concurrently with other 128-bit computations.
I showed them my government ID but they refused, understandably, because it's in Arabic. They asked for an English passport, but since I didn't have one, I couldn't continue the registration. It's really frustrating. Why isn't my credit card enough to verify my identity? Almost all top cloud/VPS providers don't ask such questions.
Because of stolen credit cards? They are minimising potential damage like this. Malicious actors are less likely to hand over their ID to send spam, and ID cards are harder to steal than credit cards :)
Maybe you could get them to accept a notarized translation (Beglaubigte Übersetzung) of your ID. It's gonna add about 50€ up-front cost, so I don't know if that's worth the price or the hassle at your project scope.
We're simply making use of Python's ability to load a module from a zip file [0]. Therefore, the generation[1] is just zipping up all the files and prepending a shebang.
Might be less confusing if you append '.zip' in the first two commands:
zip --quiet youtube-dl.zip youtube_dl/*.py youtube_dl/*/*.py
zip --quiet --junk-paths youtube-dl.zip youtube_dl/__main__.py
When you echo the shebang, overwriting the file, I was thrown off. I'm thinking, "Why did you just zip all those contents into the file only to throw them out?" Then I see the `cat` line, and it makes sense: it appends the .zip to the end of the file.
Sorry! The problem is that our userbase is split about wanting the playlist or the video. You can create a file ~/.config/youtube-dl.conf with the content --no-playlist so that you don't have to type it out every time.
The problem is that this only works for some YouTube videos (for example it will fail for basically all VEVO videos), not to mention maintainability issues.
I had to look up what "VEVO" was: a joint venture of several major record labels and Google, launched in 2009.
Personally I have no need for "VEVO" videos. Nor do I ever encounter VEVO youtube urls posted to websites, like HN. I wonder why?
As for maintainability, I beg to differ. The raison d'être for this script arose out of frustration that early YouTube download solutions, e.g. gawk scripts, clive, etc., kept breaking whenever something at YouTube changed. I got tired of waiting for these programs to be fixed, if that ever happened.
I can fix this 164 line script faster if YouTube changes something than waiting for a third party to fix something they developed that is far more complex. Moreover, it does not rely on Python. Is there something wrong with DIY?
I see someone posted a link in this thread to another 208 line script, yget, that uses sed and awk. This further demonstrates the relative simplicity of downloading YouTube videos.
An alternative to goofing around on the youtube.com web site, scrolling constantly and getting hit with advertising and endless lists of "related" videos is to search and retrieve youtube urls from the command line via gdata.youtube.com.
Please don't pass in -citw [0]! I have personally run youtube-dl on android, works fine. (Disclaimer though: I am the current lead developer, so may have missed a pitfall or two).
Hi, I'm the current lead developer. We update extremely frequently because our release model is different from other software; there is usually little fear of regressions (fingers crossed), and lots of tiny features (i.e. small fixes or support for new sites) that are immediately useful for our users. We've had the experience that almost all users prefer it that way, so we try to enable every reporter to get the newest version by simply updating instead of having to check out the git repository.
As @fillipo said above, there is little if any pushback from video sites. Most of the time, they update their interface (we've gotten better at anticipating minor changes) and something breaks. The recent string of YouTube breakage (for some videos, mostly music videos; general video is unaffected) is caused by the complexity of their new player system, which forces us to behave more and more like a full-fledged web browser. But I think we usually manage to get out a fix and a new release within a couple of hours, so after a quick youtube-dl -U (caveats do apply[0]) you should be all set again. Sorry!
I'm really grateful for this tool: both to the creator, and those that keep making sure it works. I have a swath of my life tied up in a couple of youtube playlists, and every now and then videos disappear (presumably due to DMCA-requests) -- and it's always annoying. With youtube-dl, I can simply download my lists, and then I can be confident that whatever obscure (or not) tune or cover version I found some years ago, will be in my archives (I've yet to automate this -- but my goal isn't really to archive all the things -- I generally add few songs at a time).
Anyway, if you didn't write this tool (and update it) -- I'd have to do it myself. And I'd rather not do anything myself ;-)
Thank you for all your contributions! Can you update that article to use video_id = self._match_id(url) and _VALID_URL = 'https?://...' though? We've also added a fair bit of "official" documentation at https://github.com/rg3/youtube-dl/blob/master/README.md#addi... .