LLMS

All 6 entries tagged with LLMS.

2025: The year in LLMs by Simon Willison

My tools.simonwillison.net collection of HTML+JavaScript tools was mostly built this way: I would have an idea for a small project, prompt Claude Artifacts or ChatGPT or (more recently) Claude Code via their respective iPhone apps, then either copy the result and paste it into GitHub's web editor or wait for a PR to be created that I could then review and merge in Mobile Safari.

I have been doing this a lot this past year as well. Most of the site was built this way: first with Cursor, then Codex, and finally Claude.

Micro
LLMS SIMON-WILLISON

2025 LLM Year in Review by Andrej Karpathy

LLMs are emerging as a new kind of intelligence, simultaneously a lot smarter than I expected and a lot dumber than I expected. In any case they are extremely useful and I don't think the industry has realized anywhere near 10% of their potential even at present capability. Meanwhile, there are so many ideas to try and conceptually the field feels wide open.

A nice read, if a little long. Perhaps that is why I had not gotten to it yet.

Micro
LLMS KARPATHY

Your job is to deliver code you have proven to work by Simon Willison

A computer can never be held accountable. That's your job as the human in the loop.

Almost anyone can prompt an LLM to generate a thousand-line patch and submit it for code review. That's no longer valuable. What's valuable is contributing code that is proven to work.

I liked the way Simon put it: your job is to deliver code you have proven to work.

Micro
CODE LLMS AGENTIC-CODING

Three Years from GPT-3 to Gemini 3 by Ethan Mollick

Three years ago, we were impressed that a machine could write a poem about otters. Less than 1,000 days later, I am debating statistical methodology with an agent that built its own research environment. The era of the chatbot is turning into the era of the digital coworker. To be very clear, Gemini 3 isn’t perfect, and it still needs a manager who can guide and check it. But it suggests that “human in the loop” is evolving from “human who fixes AI mistakes” to “human who directs AI work.” And that may be the biggest change since the release of ChatGPT.

Google announced Gemini 3.0, which brings it closer to the state of the art relative to other models. They claim it’s better than the rest. In this field, that’s a little subjective.

It has given me an interesting headache, though. I was planning to take a yearly subscription to Claude. Now I will test this out first instead.

Micro
GEMINI ETHAN-MOLLICK GOOGLE LLMS

DeepSeek may have found a new way to improve AI’s ability to remember by Caiwei Chen

Instead of storing words as tokens, its system packs written information into image form, almost as if it’s taking a picture of pages from a book. This allows the model to retain nearly the same information while using far fewer tokens, the researchers found.

It also stores older or less critical information in slightly blurred pictures.

A picture is worth a thousand words after all.

Micro
AI DEEPSEEK LLMS