| | We gave an AI a 3 year retail lease and asked it to make a profit (andonlabs.com) |
| 198 points by lukaspetersson 4 days ago | past | 286 comments |
|
| | We gave an AI a 3 year retail lease and asked it to make a profit (andonlabs.com) |
| 34 points by lukaspetersson 10 days ago | past |
|
| | Releasing Vending-Bench 2 for measuring model performance on running a business (andonlabs.com) |
| 2 points by lr0 54 days ago | past |
|
| | Bengt Hires a Human–Towards a Happy Future with AI Employers (andonlabs.com) |
| 2 points by lukaspetersson 63 days ago | past | 1 comment |
|
| | The Evolution of Bengt Betjänt (andonlabs.com) |
| 54 points by lukaspetersson 70 days ago | past | 7 comments |
|
| | Vending-Bench 2 (andonlabs.com) |
| 2 points by samdung 70 days ago | past |
|
| | Opus 4.6 on Vending-Bench – Not Just a Helpful Assistant (andonlabs.com) |
| 5 points by lukaspetersson 74 days ago | past | 1 comment |
|
| | Gemini 3 is #1 on Vending-Bench 2 (andonlabs.com) |
| 1 point by lukaspetersson 5 months ago | past |
|
| | Our LLM-controlled office robot can't pass butter (andonlabs.com) |
| 229 points by lukaspetersson 5 months ago | past | 117 comments |
|
| | Misaligned Vending Machines [pdf] (andonlabs.com) |
| 1 point by bulla 7 months ago | past |
|
| | Vending-Bench: Testing long-term coherence in agents (andonlabs.com) |
| 3 points by andromaton 9 months ago | past | 1 comment |
|
| | Vending-Bench: Testing long-term coherence in agents (andonlabs.com) |
| 1 point by vector_spaces 12 months ago | past |
|
| | Vending-Bench: Testing long-term coherence in agents (andonlabs.com) |
| 5 points by tosh on April 19, 2025 | past | 2 comments |
|
| | Vending-Bench: Testing long-term coherence in agents (andonlabs.com) |
| 1 point by gdeglin on March 5, 2025 | past |
|
| | Claude isn't the best Computer-use agent (andonlabs.com) |
| 2 points by lukaspetersson on Jan 10, 2025 | past |
|