Weird thing to say about the company that will have the definitive LLM once Pro 1.5 is available with its million-token context. I can't even imagine what Ultra 1.5 will bring.
You're kind of contradicting yourself. If a high token count told you all you needed to know about how smart a model is, Ultra 1.5 would be just as good as Pro 1.5.
The fact of the matter is it remains to be seen how smart either model will be.
Could you point me to evidence that what you just wrote is true?
I just asked Mistral 7B to provide 5 sentences that end with the word "Apple." It couldn't do it. I then provided 5 examples generated by GPT-4 and asked it to generate 5 more. It still couldn't do it. With 10 GPT-4 examples, it still couldn't do it.
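For anyone who wants to try reproducing this, here's a rough sketch of the test. It assumes a local Mistral 7B served behind an OpenAI-compatible endpoint (e.g. Ollama at localhost:11434); the model tag, base URL, and few-shot examples are placeholders for whatever your setup and prompts actually are, not the exact ones used above.

```python
# Rough sketch: few-shot "sentences ending in Apple" test against a local Mistral 7B.
# Assumes an OpenAI-compatible server (e.g. Ollama) at http://localhost:11434/v1;
# the model name and example sentences below are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="not-needed")

# Few-shot examples in the style of the GPT-4 ones mentioned above (made up here).
examples = [
    "After hours of debate, she finally reached for the Apple.",
    "The teacher smiled and handed the quietest student an Apple.",
]

prompt = (
    'Write 5 sentences that each end with the word "Apple".\n'
    "Here are some examples:\n"
    + "\n".join(f"- {s}" for s in examples)
    + "\nNow write 5 new ones, one per line."
)

resp = client.chat.completions.create(
    model="mistral",  # whatever tag your local server uses
    messages=[{"role": "user", "content": prompt}],
    temperature=0.7,
)

# Count how many returned sentences actually end with "Apple" (ignoring punctuation).
lines = [l.strip(" -") for l in resp.choices[0].message.content.splitlines() if l.strip()]
passed = sum(1 for l in lines if l.rstrip(".!?\"'").endswith("Apple"))
print(f"{passed}/{len(lines)} sentences actually end with 'Apple'")
```

Rerunning that while varying the number of few-shot examples is the whole experiment: in my case the pass count didn't improve no matter how many examples I stacked into the prompt.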
You seem to be saying the models can generalize over the entire context window, and that I should keep providing examples up to the token limit because that will make the model smarter. Is there any evidence of that?
As a former Google fanboy (and current genAI enthusiast who keeps being disappointed by all the models that claim to rival GPT-4 and then don't even beat GPT-3.5), I'll believe it when I see it.