nee1r's comments

thanks! a lot of credit to the people who helped write/edit


giving back to the research community! releasing and talking about research helps everyone


thanks! i definitely love diffusion and pushed for it; as a non-causal generative method i think it's pretty unique


thanks! got a lot of inspiration from VPT (https://arxiv.org/abs/2206.11795). it's a great paper, would recommend a read

we all have various backgrounds. personally, i did a lot of materials science x AI research and fundamental architecture research before this


we have an alignment blog post dropping soon! we're scaling up in the next couple of months, then hopefully opening up an API or licensing it.

Benchmarks are really fun; we have lots of secret ones. Our main thesis is that you should use the same benchmarks to measure a human's ability to use a computer as you would an AI model: a suite of continuous long-term planning tasks (games) plus everyday tasks such as marking emails as spam.

definitely! we are looking into more interp + visualizations in general as we scale up.


planning on instruct tuning soon!


Very exciting, great work to you and the team. Will there be any APIs available for commercial use or open source access?

What's the plan on that front?


Awesome! That will be very interesting


safety was important for the demo; the model didn't have access to the brake or accelerator.


Frankly, this comment does little to address the parent's point: doing this on the open road was both illegal and reckless. It reflects extremely poorly on your character and the project as a whole.


thanks! the math and architecture of the FDM (no video encoder) is pretty simple: it's a regular transformer doing next-token prediction, but with frames interleaved.
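
(for concreteness, here's a rough sketch of what "frames interleaved" can look like: frame tokens and action tokens concatenated into one causal sequence, trained with ordinary next-token cross-entropy. all names and sizes below are illustrative, not the actual model.)

    import torch
    import torch.nn as nn

    class FrameActionTransformer(nn.Module):
        # hypothetical sketch: frame tokens and action tokens share one
        # vocabulary, with action ids offset past the frame-token range
        def __init__(self, frame_vocab=1024, action_vocab=32,
                     d_model=256, n_heads=8, n_layers=4, max_len=2048):
            super().__init__()
            self.vocab = frame_vocab + action_vocab
            self.embed = nn.Embedding(self.vocab, d_model)
            self.pos = nn.Embedding(max_len, d_model)
            layer = nn.TransformerEncoderLayer(
                d_model=d_model, nhead=n_heads,
                dim_feedforward=4 * d_model, batch_first=True)
            self.blocks = nn.TransformerEncoder(layer, n_layers)
            self.head = nn.Linear(d_model, self.vocab)

        def forward(self, tokens):
            # tokens: (B, T), interleaved [frame..., action, frame..., action, ...]
            B, T = tokens.shape
            x = self.embed(tokens) + self.pos(torch.arange(T, device=tokens.device))
            causal = torch.triu(torch.full((T, T), float("-inf"),
                                           device=tokens.device), diagonal=1)
            return self.head(self.blocks(x, mask=causal))

    # standard next-token objective over the shifted sequence
    model = FrameActionTransformer()
    seq = torch.randint(0, model.vocab, (2, 64))
    logits = model(seq[:, :-1])
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, model.vocab), seq[:, 1:].reshape(-1))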


yeah! i love the BCO paper, i think it's extremely intuitive, and these methods are really interesting at a time when data without labels is abundant. i especially like the idea of iteratively making the inverse dynamics better; might lean closer to that in the future
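
(for anyone curious, the BCO(α) loop from the paper is roughly the following. this is a toy sketch with a hypothetical gym-style env and made-up network names, not our actual training code.)

    import torch
    import torch.nn as nn

    def bco_alpha(env, demo_pairs, policy, idm, iters=5, rollout_steps=1000):
        # demo_pairs: demonstrator (s, s') observation pairs, shape (N, 2, obs_dim);
        # policy and idm are classifiers over a discrete action set
        opt_idm = torch.optim.Adam(idm.parameters(), lr=1e-3)
        opt_pi = torch.optim.Adam(policy.parameters(), lr=1e-3)
        for _ in range(iters):
            # 1) roll out the current policy to collect self-labeled (s, a, s')
            s, transitions = env.reset(), []
            for _ in range(rollout_steps):
                a = policy(torch.as_tensor(s, dtype=torch.float32)).argmax().item()
                s2, _, done, _ = env.step(a)
                transitions.append((s, a, s2))
                s = env.reset() if done else s2
            # 2) fit the inverse dynamics model on the agent's own experience
            pairs = torch.tensor([[*t[0], *t[2]] for t in transitions],
                                 dtype=torch.float32)
            acts = torch.tensor([t[1] for t in transitions])
            opt_idm.zero_grad()
            nn.functional.cross_entropy(idm(pairs), acts).backward()
            opt_idm.step()
            # 3) infer actions for the demonstrator's (s, s') pairs, then
            #    behavior-clone the policy on those inferred labels
            inferred = idm(demo_pairs.flatten(1)).argmax(dim=-1)
            opt_pi.zero_grad()
            nn.functional.cross_entropy(policy(demo_pairs[:, 0]), inferred).backward()
            opt_pi.step()
        return policy

each pass through the loop gives the inverse dynamics model more on-policy data, which is the iterative-improvement idea above.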


> i especially like the idea of iteratively making the inverse dynamics better

Same here.

The notion of inducing these models to "hypothesize" distributions over possible actions given subsequent observed transitions makes me think of "contrastive divergence," the method Hinton and others came up with for unsupervised training of Restricted Boltzmann Machines (RBMs), in the prehistoric era of deep learning.

Given each training sample, an RBM would 1) execute a forward pass, 2) sample its output units, 3) "hypothesize" its input units, 4) execute another forward pass on the "hypothesized" input units to sample new output units, and 5) compute a contrastive update from the difference between the two passes, applied locally. RBMs could be stacked, with the output units of one becoming the input units of the next. Hinton called the input units "visible," and the output ones "hidden."
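
In code, one step of CD-1 on a binary RBM looks roughly like this (toy PyTorch matching the steps above, illustrative names only):

    import torch

    def cd1_step(v0, W, b_vis, b_hid, lr=0.1):
        # v0: batch of binary visible vectors, shape (B, n_vis); W: (n_vis, n_hid)
        # 1-2) forward pass: visible -> hidden probabilities, sample hidden units
        p_h0 = torch.sigmoid(v0 @ W + b_hid)
        h0 = torch.bernoulli(p_h0)
        # 3) "hypothesize" the inputs: hidden -> visible reconstruction
        p_v1 = torch.sigmoid(h0 @ W.T + b_vis)
        v1 = torch.bernoulli(p_v1)
        # 4) second forward pass on the reconstruction
        p_h1 = torch.sigmoid(v1 @ W + b_hid)
        # 5) contrastive update: data statistics minus reconstruction statistics
        W += lr * (v0.T @ p_h0 - v1.T @ p_h1) / v0.shape[0]
        b_vis += lr * (v0 - v1).mean(dim=0)
        b_hid += lr * (p_h0 - p_h1).mean(dim=0)

    # toy usage
    n_vis, n_hid = 6, 4
    W = torch.randn(n_vis, n_hid) * 0.01
    b_vis, b_hid = torch.zeros(n_vis), torch.zeros(n_hid)
    cd1_step(torch.bernoulli(torch.rand(8, n_vis)), W, b_vis, b_hid)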

It's not the same, obviously, but the idea of modeling machine-generated inputs (or actions) given outputs (or transitions) has always been appealing. It has a long history.


cool, thanks for the title idea!! hopefully when we scale up in the next month or two we can update the community

