More

Lapha · on Jan 30, 2024

Running models locally using GPU inference shouldn't be too bad as the biggest impact in terms of performance is ram/vram bandwidth rather than compute. Some rough power figures for a dual AMD GPU setup (24gb vram total) on a 5950x (base power usage of around 100w) using llama.cpp (i.e., a ChatGPT style interface, not Copilot):

46b Mixtral q4 (26.5 gb required) with around 75% in vram: 15 tokens/s - 300w at the wall, nvtop reporting GPU power usage of 70w/30w, 0.37kWh

46b Mixtral q2 (16.1 gb required) with 100% in vram: 30 tokens/s - 350w, nvtop 150w/50w, 0.21kWh.

Same test with 0% in vram: 7 tokens/s - 250w, 0.65kWh

7b Mistral q8 (7.2gb required) with 100% in vram: 45 tokens/s - 300w, nvtop 170w, 0.12kWh

The kWh figures are an estimate for generating 64k tokens (around 35 minutes at 30 tokens/s), it's not an ideal estimate as it only assumes generation and ignores the overhead of prompt processing or having longer contexts in general.

The power usage essentially mirrors token generation speed, which shouldn't be too surprising. The more of the model you can load into fast vram the faster tokens will generate and the less power you'll use for the same amount of tokens generated. Also note that I'm using mid and low tier AMD cards, with the mid tier card being used for the 7b test. If you have an Nvidia card with fast memory bandwidth (i.e., a 3090/4090), or an Apple ARM Ultra, you're going to see in the region of 60 tokens/s for the 7b model. With a mid range Nvidia card (any of the 4070s), or an Apple ARM Max, you can probably expect similar performance on 7b models (45 t/s or so). Apple ARM probably wins purely on total power usage, but you're also going to be paying an arm and a leg for a 64gb model which is the minimum you'd want to run medium/large sized models with reasonable quants (46b Mixtral at q6/8, or 70b at q6), but with the rate models are advancing you may be able to get away with 32gb (Mixtral at q4/6, 34b at q6, 70b at q3).

I'm not sure how many tokens a Copilot style interface is going to churn though but it's probably in the same ballpark. A reasonable figure for either interface at the high end is probably a kWh a day, and even in expensive regions like Europe it's probably no more than $15/mo. The actual cost comparison then becomes a little complicated, spending $1500 on 2 3090s for 48gb of fast vram isn't going to make sense for most people, similarly making do with whatever cards you can get your hands on so long as they have a reasonable amount of vram probably isn't going to pay off in the long run. It also depends on the size of the model you want to use and what amount of quantisation you're willing to put up with, current 34b models or Mixtral at reasonable quants (q4 at least) should be comparable to ChatGPT 3.5, future local models may end up getting better performance (either in terms of generation speed or how smart they are) but ChatGPT 5 may blow everything we have now out of the water. It seems far too early to make purchasing decisions based on what may happen, but most people should be able to run 7b/13b and maybe up to 34/46b models with what they have and not break the bank when it comes time to pay the power bill.

Lapha · on Jan 28, 2024

Well, he absolutely should have been using Vim, he instead spent the week learning vi.

Lapha · on Dec 19, 2023

If it were just a standalone tool it wouldn't affect the licencing of the project, but it's not a tool, it's a software library, so projects using the library need to follow the terms of the GPL.

The tricky part here is that it's a library that produces graphs which themselves aren't covered by the terms of the GPL. The intent of the tool seems to be that you use it to produce graphs during development for benchmarking or during research or whatever where the tool is internal but the graphs aren't necessarily internal, in which case the licence of the library is mostly irrelevant as the GPL is mostly concerned with distributing software to other users of said software, not the output of the software, but if you later choose to want to release that tool it really aught to follow the terms of the GPL and be under a licence compatible with the GPL, so the note about it being a GPL licenced library is completely valid. You should use a more permissive LGPL/MIT/BSD/whatever licenced library like benchit if you don't want to have to think about this sort of stuff.

kristjansson · on Dec 19, 2023

> if you later choose to want to release that tool it really aught to follow the terms of the GPL and be under a licence compatible with the GPL

This may be in the spirit of Free-as-in-freedom software, but I don't think it's grounded in the reality of copyright and the GPL. With the usual disclaimer that IANAL: copyright attaches to the work (the code), and is retained by the author. The author grants you certain rights to use, modify, and redistribute the code subject to conditions (the GPL). Those constraints only encumber your works that are derivative of the GPL-licensed code i.e. either direct modifications of original source or linking original/derived object code.

A product that directly and necessarily imported perfplot would probably itself need to be GPL licensed to be distributed. A product developed with merely the assistance of something like perfplot would not need to be licensed in any particular way, any more than software written in Emacs needs to be GPL licensed.

cozzyd · on Dec 19, 2023

As far as I understand, I don't think even importing is an issue, unless you are distributing the module too as a combined artifact (e.g. docker image). See e.g. https://github.com/pyFFTW/pyFFTW/issues/229

There's a lot of GPL FUD though from commercial interests though.

kristjansson · on Dec 19, 2023

Yeah, there's probably some untested boundaries at the intersection of the GPL, interpreted languages, and their libraries.

Those boundaries are nowhere close to the situation in the OP though. I guess the FUD is working...

Lapha · on Dec 19, 2023

Note that the author of the unauthorised derivative work 'B', based on the 'original' work 'A', still retains copyright over their own (unauthorised) work B. Even if a court compels the author of B to stop distributing their infringing work B this fact doesn't change, B doesn't go out of copyright, and the author of A (or anybody else for that matter) doesn't have any additional rights to the use the work B.

In this particular case the Guardian article doesn't answer the question, but it's almost certainly the case that the courts found that Tolkien's estate didn't infringe upon Polychron's unauthorised work. In the UK (and much of the world), the bar for creating an original work is incredibly low, currently it's the EU approach where the author needs to express (creative) originality through choice, selection, or arrangement of a work. Previously the UK had the 'sweat of the brow' rule where the bar was even lower, only requiring 'labour, skill or judgement' in producing a work. The UK also explicitly protects derivative works from being considered infringement if they're found to be a parody, caricature, or pastiche, but this obviously isn't the case here.

Also note that 'copyright service' is a private company that sells copyright protection as a service through registering a work. The UK, unlike some countries, has no register of copyrighted works so you don't need to register a work for it to receive copyright protection in the UK.

Lapha · on Nov 19, 2023

>as open source really does only mean source available

The definition and history of the term as a licence is unambiguous in that the only restriction on redistribution is that it contains the source code under the same licence. There are senior engineers alive today that weren't even born when this was the commonly understood meaning of the term, it's not a new concept.

The term and usage is being co-opted these days but that's bound to happen when it's not a legally protected definition. Give it another 10-20 years and I'm sure we'll be having the same argument over whatever term ends up replacing 'open source'.

yieldcrv · on Nov 19, 2023

I’m all for “language evolving”

Language exists to convey a shared concept

In this case, the evolved version of open source fails to convey a shared concept in comparison to the prior term, “free and open source software” or FOSS for a shorthand adjective

Here, people with knowledge of the lexicon are using it accurately, and people without knowledge of the lexicon or its etymology are complaining when they should be pushing for FOSS instead of getting surprised everytime

Lapha · on Nov 19, 2023

'Free' software has problems with its name too. The ones muddying the waters are people and companies releasing source code with a proprietary licence while trying to latch onto the open source branding.

ekianjo · on Nov 23, 2023

There is something called FOSS as well and Libre Software if you prefer a clearer term that does not imply free as beer

Lapha · on Oct 1, 2023

The topic of human vision and perception is complex enough that I very much doubt it's scientists who are making the claim that we can't perceive anything higher than 30-60fps. There's various other effects like fluidity of motion, the flicker fusion threshold, persistence of vision, and strobing effects (think phantom array/wagon wheel effects), etc, which all have different answers. For example, the flicker fusion threshold can be as high as 500hz[0], similarly strobing effects like dragging your mouse across the screen are still perceivable on 144hz+ and supposedly 480hz monitors.

As far as perceiving images goes, there's a study at [1] which shows people can reliably identify images shown on screen for 13ms (75hz, the refresh rate of the monitor they were using). That is, subjects were shown a sequence of 6-12 distinct images 13ms apart and were still able to reliably able to identify a particular image in that sequence. What's noteworthy is this study is commonly cited for the claims that humans can only perceive 30-60fps, despite the study addressing a completely separate issue to perception of framerates, and is a massive improvement over previous studies which show figures as high as 80-100ms, which seems like a believable figure if they were using a similar or worse methodology. I can easily see this and similar studies being the source of the claims that people can only process 10-13 images a second, or perceive 30-60 fps, if science 'journalists' are lazily plugging something like 1000/80 into a calculator without having read the study.

There's also the old claim [2] from at least 2001 that the USAF studied fighter pilots and found that they can identify planes shown on screen for 4.5ms, 1/220th of a second, 1/225th of a second, or various other figures, but I can't find the source for this and I'm sure it's more of an urban legend that circulated gaming forums in the early 2000s than anything. If it was an actual study I'm almost certain perception of vision played a role in this, something the study at [1] avoids entirely.

[0] 'Humans perceive flicker artifacts at 500 Hz' https://pubmed.ncbi.nlm.nih.gov/25644611/

[1] 'Detecting meaning in RSVP at 13 ms per picture' https://dspace.mit.edu/bitstream/handle/1721.1/107157/13414_...

[2] http://amo.net/nt/02-21-01fps.html

Lapha · on Sept 29, 2023

While I'm sure the publicly available and already moderated Reddit comments OpenAI pays workers to classify are truly disturbing, nobody should be forced to read Reddit after all, I think the moderators at Facebook, Google, porn websites, etc, that have to deal with moderating graphic videos of extreme violence, torture, gore, abuse, sexual abuse, child sexual abuse, etc, are the ones that have to look at the worst stuff on the planet.

AndrewKemendo · on Sept 29, 2023

They both use the same type of outsourced labeling services:

https://www.sama.com/

Lapha · on Aug 31, 2023

>UK plug sockets are rated at 13 A, giving a maximum power rating of 2,990 W.

Nominally, anyway. When the EU standardised mains voltages they mostly did so on paper by fudging the tolerances, UK's 240v +/- 3% became 230v -6/+10%. 28 years later and I still see the mains voltage in the 240-250v range more often than not.

Lapha · on Aug 20, 2023

Ideally it's lifecycle carbon footprint averaged over expected power production during that time, but the numbers aren't always directly comparable and the error bars are huge to boot.

With solar, most of the carbon footprint comes from the massive energy requirements needed to produce the panels, and countries that use coal to produce panels like China have a significantly higher carbon footprint than say the EU. Then there's the matter of where you install them, panels in most of Europe will have a CO2/kWh figure similar to to panels installed in north-west Canada, while panels installed in the US will have a similar figure to panels installed in southern Europe and northern Africa. Newer panels generally have a longer lifespan and are more efficient so will have a lower number even if the total footprint stays the same. Should the albedo effect be considered with solar? It matters if you plan to cover a light desert with dark panels. How often it rains can also ironically have a difference, as rain helps clean the panels keeping them running efficiently.

With nuclear, there's the mostly fixed costs of constructing and decommissioning the plants, the ongoing cost of running and maintaining the plants, disposing of spent fuel, and with most of the cost coming from mining the fuel. To me nuclear seems a little more straight forward to calculate, but there's still variability that can come from the availability and difficulty of mining the ore. There's also political issues to consider. If your country decides to shut down your nuclear plants prematurely, like Germany did, the huge upfront cost of building the plants can't be recouped by running them for an additional 10, 20, 30+ years until it becomes necessary to decommission them.

Regardless of how renewables are compared to nuclear, coal and gas are the elephants in the room when it comes to CO2 produced per kWh.