Thanks, Karo. I have to say, your launch deep dives are some of the most interesting ones I read. You always help me understand what to really pay attention to.
That makes me very happy, thank you so much, Adam! 🤗
Awesome stuff, Karo. I’m really liking the idea of fewer shortcuts and more honesty. Will have to test more, but my initial impression is that they are heading in the right direction with this one.
Really interesting (I'm not a dev, so excuse the question). Do you know how many lines of code it originally was before the 750k, and how much it cost or what the token count was?
Great question! No, I don't know. But I'll try to find out! 🤗
Thanks for this. I love using AI for a multitude of tasks, routine and creative.
And I don’t understand why we go so insane and give so much attention to these weekly benchmark improvements. They’re not life-changing or revolutionary. Just my take. Thanks.
Thank you so much for taking the time to read and comment, Dr. Jim! 🤗 Yes, the benchmark theater can get ridiculous, heheh, especially when benchmarks are cited without context.
But for builders, they're worth watching for two reasons:
1. Even small improvements can matter if they make previously annoying workflows reliable enough to use.
2. Those small jumps sometimes reveal a product shift underneath: less babysitting, better verification, etc.
And for me personally, benchmarks are useful because they give me a way to discuss critical AI literacy.
A benchmark is a receipt from one narrow test environment, so I always encourage everyone to run their own tests. That’s where it becomes interesting.