phainopepla2's comments

> On Letterboxd, the reviews can be as entertaining as the movies themselves.

Hard disagree here. There are some good reviewers on there but you have to wade through a mountain of terrible one-liners from wannabe comedians. No matter how many of these users I block there are always more of them popping up. I wish they had a character or sentence length filter for reviews that could be toggled on, for those of us who aren't looking for a "Twitter for movies" experience.


The only solution is to follow/friend people who write thoughtful reviews, and to go through their friends lists if you haven't found many yet.

Also, if you go to esoteric enough films you mostly filter out the joke reviews.


My strategy has been to follow accounts that just repost reviews by well-respected film critics, usually older ones like Stanley Kauffmann or Pauline Kael. Typically they're well written even if I disagree with the take, and they avoid the overwriting problem that even the more thoughtful Letterboxd users are prone to (it's easier to write a long, rambling review than a concise one). But those accounts get taken down occasionally, usually don't have any reviews for more recent movies, and have spotty coverage even for older movies.

Absolutely ludicrous comparison

Is it, though?

I'd argue math is significantly more complex than language.


Reminiscent of when coding agents fail to fix a bug and keep digging themselves deeper into a hole.

> I understand the problem now, I just need to...

> I was wrong before but the issue is now clear, let me...

> I have complete clarity now! I'm going to...


Why not just use DS V4 Flash for the small stuff? Very fast and extremely cheap.

The dsv4 flash is 158B params in total. It is possible to run locally but will require all my system RAM.

Also, a lot of my day-to-day tasks perform the same on both small and bigger models: summarize a web page, draft a response, translations, quick web search, etc.


Sorry, I meant non-locally.

I'm assuming privacy is not a concern since you mentioned using Deepseek already. The cost of V4 Flash for small tasks is so minuscule as to be almost free, and you don't have to deal with a churning laptop (or even buying a high-end laptop, for someone who doesn't already have one).

I guess what I'm really asking is, what's the advantage of using these small local models if privacy isn't a concern?


I do use both the "normal" DSv4 and the Flash variant, non-locally. It works well, though not exceptionally. And while it's cheap, I'd say that the difference between $1 per month and $5 per month is not a big concern to me. IMO pricing is pretty competitive among open-weight models: https://huggingface.co/inference/models

It depends on the use case, but I've found two where a local model is a must rather than optional:

- Running offline without internet access: for example, I have a project that transcribes and summarizes audio in real time. I've already used it at some events where wifi was not available: https://github.com/ngxson/llama.cpp-realtime-audio-recap

- Handling private personal data, for example health records. This is the same category of "privacy" that you mentioned, but I just want to bring up the fact that people value their privacy differently.


dsv4 flash has 284 billion parameters, not 158 billion.

Huggingface's little parameter count badge seems unreliable.
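
For a rough sense of what the two figures mean for running locally, here is a back-of-the-envelope sketch. It only counts memory for the weights themselves; the helper name and the bit-widths are my own illustrative assumptions, not the formats the model is actually distributed in, and it ignores the KV cache, activations, and runtime overhead.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Hypothetical helper for illustration: ignores KV cache, activations,
    and any per-format overhead such as quantization scales.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # 158B is the figure from the badge, 284B is the corrected count.
    for params in (158, 284):
        for bits in (16, 8, 4):
            gb = weight_memory_gb(params, bits)
            print(f"{params}B params @ {bits}-bit: ~{gb:.0f} GB")
```

Even at an assumed 4 bits per weight, the gap between the two counts is the difference between roughly 79 GB and roughly 142 GB of weights, which matters a lot for whether it fits in system RAM at all.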


Have any major open weight models been "open data"? Wouldn't that entail distributing vast amounts of copyrighted data?

Olmo from AllenAI has been releasing their full pipelines including data [1]. A lot of it is just repackaged and resampled dumps of copyrighted data that has long been publicly available: Common Crawl, arxiv, Wikipedia, StackExchange, reddit, all of which are presumably copyrighted under different licenses. Go on Huggingface and you can find massive multi-TB data dumps used for pretraining.

It is just as legal as when Uber and AirBNB were running illegal taxis and hotels during their growth phase. I'm just waiting for some corporate IP law firm to learn about Huggingface.

[1] https://huggingface.co/datasets/allenai/dolma3_pool


It's rather off-topic at this point, but I've never understood how HF can afford to be a CDN for such huge files. It seems like enterprise customers must be subsidizing a lot, but...at that point, is there not a cheaper alternative that doesn't subsidize every hobbyist and startup around?

> how HF can afford to be a CDN for such huge files

Bandwidth and storage are practically free compared to the cost of GPU clusters. HF gets rewarded heavily by the capital markets for being in AI without actually doing much AI stuff, which is a huge win compared to what they are paying for bandwidth and storage.


> how HF can afford to be a CDN for such huge files

To be precise, Amazon Cloudfront is the CDN. Maybe they got some startup deal?

Amazon does now also have flat rate plans that are a lot cheaper.


> I'm just waiting for some corporate IP law firm to learn about Huggingface.

Presumably they already know. The issue is that IP law firms are tiny compared to the trillions in capital pouring into "AI". And if you believe the USA is a capitalist country where the side with deeper pockets wins, you know you're not going to win against the trillionaires.


Why is the text field in dataset preview table populated with pornographic labels?

Because it's a random sample of the Internet?

Could be, but then I guess that means something like 98% is pornography, because it's every row. If the sample is random, that's a bad sign!

NVIDIA's recent Nemotrons tend to be open training data and code.

Probably as a base to use by people buying NVIDIA hardware to train their own.


Nemotron is mostly open data. They only release a portion of their pre-training data. From https://docs.nvidia.com/nemotron/latest/nemotron/super3/pret...

  Open-source data coverage: The released datasets cover an estimated 8–10T tokens 
  (~40–50% of the internal 25T blend). Missing categories include code (~14% of blend),
  nemotron-cc-code (~2%), crawl++ (~2%), and academic text (~2%). Users should 
  supplement with their own data for these categories and adjust train_iters 
  accordingly.

Nemotron is the strongest model (on most benchmarks) that has its full training pipeline and most of the data open. Olmo 3 from AllenAI, and K2 Think V2 from Mohamed bin Zayed University of Artificial Intelligence are both fully open, but not as capable as the Nemotron family. Granite has much of the training pipeline and data open, but is missing some of each.

IBM Granite has been open data from the beginning, iirc.

That's because Qwen's flagship models are not, in fact, open weight. Qwen3.7 Max, Qwen3.7 Plus and others are closed weight.

You can use Qwen3.6 35B A3B (for example) on Openrouter with a US-based ZDR provider, because it's one of their open weight models


> That's because Qwen's flagship models are not, in fact, open weight

They changed course when they fired the old lead and hired a new one, formerly of Gemini.


No, Qwen Max series has always been proprietary.

They also stopped releasing 100B+ model weights after firing him.

Insurance companies spend a lot of time on bill review and disputes over coding. They will develop an adversarial AI process to counteract this (assuming they haven't already). Race to the bottom and we all win

He allowed the destruction of Gaza to start, and did nothing to stop it even when it was clear what was happening and it was within his power to do so. Of course, Trump would have been no better on that front. But that brings us back to...

The lesser of two evils is still evil


So "there's no difference" becomes "any evil is the same" - that's not how this works.

The lesser of two evils is still lesser.

This is very true. My parents were organic farmers back before there was any legal or official designation, and keeping synthetic pesticides out of the soil and watershed was a bigger motivation for them than fear of eating conventionally grown food.

I think as organic food became more mainstream it also became more about an individual's health and less about larger environmental and social concerns, but those concerns are at the heart of the original organic movement.


And to top it off both Y axes have the exact same label. The graph has no indication as to which line belongs to which axis. Of course, we can easily infer which is which from the surrounding article, but still... terrible data visualization


I thought the same thing about matching the line to the axis, but the legend is marked with LHS and RHS. It's very subtle.


Ah, missed that. Well, I think the point remains that it's an abominable and misleading graph


Different scales between the left and right sides as well. The vertical distance for 500k bpd on the left is about half of what it is on the right.

