
Latest Posts by Jeff Smith

This is actually a massive result that people are really all sleeping on. We can just start over from zero and, with the right language design, build a programming language that the models can write perfectly without a training corpus.

10 hours ago 1 0 0 0

Scene at #ClawCon in London is wild. Tons of @openclaw-x.bsky.social users and devs coming together to hear the state of the claw and hear direct from @steipete.me live in person. Just a packed house full of palpable, rabid enthusiasm. OSS PMF at its finest. #openclaw

1 day ago 0 0 0 0
Painting of an adorable tiny black dog, in a landscape, its front paws raised off the ground


I have nothing useful to add on the general state of things right now, but I can offer a thread of canine portraits by 18th-century French painter Jean-Jacques Bachelier, who was also director of the famed Sèvres porcelain factory. First up: Mimi, the favorite dog of Madame de Pompadour, 1762

2 days ago 319 85 6 10

In the space of an afternoon, you can see thousands of devs around the world benchmarking their current vendor's competitors, mid-outage. Just one more sign of the insane pace of change with agentic AI.

2 days ago 0 0 0 0

Wild to see the @anthropic.com Claude Code intelligence rug pulls and outages ripple through the industry at an insane pace. People are flipping back to @openaibot.bsky.social Codex, spinning up local models, testing Qwen on @openrouter.bsky.social, etc.

2 days ago 2 0 2 0
Post image

But I fear I'm far too old and slow to be this guy. I don't even *like* Red Bull.

1 week ago 0 0 0 0
Post image

I really am better suited to be this guy. I have the sweater vest and everything.

1 week ago 0 0 1 0

So, you don't feel like Ludwig when you're done. You just feel like you played as fast as you could but the game can still be played so much faster and there's probably a teenager in Korea who could mop the floor with you.

1 week ago 0 0 1 0

But with agentic coding, basically being a dev is just Starcraft with shittier graphics. You barely even notice all of the wins, because the model is solving all of the puzzles. You just wind up typing as fast as you can. It's pure video games.

1 week ago 0 0 1 0

Back in the day, being a dev used to be like being Ludwig from that David Mitchell show. You would just pace and whiteboard and ponder until you solved the puzzle. And you felt really smart when you did. It was a big part of the appeal for a lot of us.

1 week ago 1 0 1 0

The basic idea of hash tables is that “the universe is a big place, but it’s mostly empty.”

Hash tables, or how to leverage the sinking feeling of loneliness you get when you look at the sky

1 week ago 86 11 1 0
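A minimal sketch of the idea in the post above, not taken from the linked material: the universe of possible keys is enormous, but only a handful are ever stored, so a small array of buckets is enough. The class name `ChainedHashTable`, the default bucket count, and the use of separate chaining are all assumptions made for illustration.

```python
class ChainedHashTable:
    """Toy hash table using separate chaining (illustrative only)."""

    def __init__(self, num_buckets=8):
        # A small, fixed array stands in for the huge space of possible keys.
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # Fold the key's hash into the range of available buckets.
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        # Only the one bucket the key hashes to needs to be scanned.
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        return default


table = ChainedHashTable()
table.put(10**18, "a very distant star")
table.put("sol", "the nearby one")
```

Two keys drawn from an effectively unbounded space fit comfortably in eight buckets; collisions, when they happen, just share a bucket's list.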

The new “This could have been an email” is when someone does interesting research and the only way to consume it is on YouTube

“This could have been a blog post!”

1 week ago 15 3 2 0

Next #ATmosphereConf should be held in Europe

2 weeks ago 18 1 1 0
Against the dark forest The complex of ideas I’m going to call the Dark Internet Forest emerges from mostly insidery tech thinking, but from multiple directions.

Required reading for everyone following @kissane.myatproto.social’s awesome #AtmosphereConf keynote.

1 week ago 57 14 0 6

Loved every moment of @kissane.myatproto.social's keynote at #ATMosphereConf.

I need HOLD FAST merch stat. Sweatshirt, totes, and best of all: gloves.
#KelpFacts

1 week ago 6 0 0 0

Which is why, although all John Luther Adams music will put you to sleep, all John Adams music will disturb your rest forever.

1 week ago 0 0 0 0

John Adams music is like if you put a car factory on the back of a semi. And then the driver had intrusive thoughts.

1 week ago 0 0 1 0
The toughest AI benchmark just got a whole lot tougher ARC-AGI-3 is the latest version of a clever benchmark that challenges AI models to solve mini video games with no written instructions....

For the ARC-AGI-3 benchmark test, the developers made interactive puzzle games sherwood.news/tech/the-tou...

2 weeks ago 6 2 1 0
Online bot traffic will exceed human traffic by 2027, Cloudflare CEO says | TechCrunch AI bots may outnumber humans online by 2027, says Cloudflare CEO Matthew Prince, as generative AI agents dramatically increase web traffic and infrastructure demands.

Online bot traffic will exceed human traffic by 2027, Cloudflare CEO says

3 weeks ago 6 2 1 0

Claude is soooo slowwwwwwww when America is awake

go back to sleep, y'all

2 weeks ago 7 1 0 0
Research Scientist, Reinforcement Learning London, UK

DeepMind's RL team is hiring a research scientist: if you're passionate about RL, come work with us!

And if you know people who might be interested, please share:
job-boards.greenhouse.io/deepmind/job...

3 weeks ago 28 14 1 0
Post image

@martin.kleppmann.com talking to @qconferences.com London about @bsky.app and #ATproto.

3 weeks ago 6 1 0 0

I'll be at #QCon London tomorrow talking about this. Come find me if you're working on open source review tooling or contributor trust. #oss #genAI #codingAI

3 weeks ago 1 0 0 0

We're also working on the cold-start problem. Scoring new contributors LOW is accurate but not useful. The next step is tooling that helps first-time contributors understand a project's expectations before they submit.

3 weeks ago 0 0 1 0

Where we're headed: contributor scoring tells you who someone is. The harder question is whether a specific PR fits the target repo. We've seen strong signal in repo-specific fingerprinting and we're building tools around it.

3 weeks ago 0 0 1 0
A Basket of Eggs Revisiting How We Score Open Source Contributors

Full writeup with methodology and data: neotenyai.substack.com/p/a-basket-o...

3 weeks ago 0 0 1 0

We also pulled account age out of the score and into a separate advisory. The score now means one thing. Account age is context alongside it, not blended in.

3 weeks ago 0 0 1 0

New default: one ratio. Directly interpretable. If a contributor has a 78% merge rate, that's the score. No graph construction, no regression coefficients.

3 weeks ago 0 0 1 0
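A rough sketch of what the two posts above describe: the score is a single merge ratio, and account age is reported next to it rather than blended in. This is not the project's actual code. The `Contributor` fields, the definition of merge rate as merged PRs over all resolved PRs, the 30-day threshold, and returning `None` when there is no history are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Contributor:
    merged_prs: int           # hypothetical field: PRs that were merged
    closed_unmerged_prs: int  # hypothetical field: PRs closed without merging
    account_age_days: int     # hypothetical field: age of the account


def merge_rate(c: Contributor) -> Optional[float]:
    """The score is one ratio: merged PRs over all resolved PRs."""
    total = c.merged_prs + c.closed_unmerged_prs
    if total == 0:
        return None  # assumption: no history means no score at all
    return c.merged_prs / total


def account_age_advisory(c: Contributor, min_age_days: int = 30) -> Optional[str]:
    """Account age is context alongside the score, never mixed into it."""
    if c.account_age_days < min_age_days:
        return f"account is only {c.account_age_days} days old"
    return None


contributor = Contributor(merged_prs=78, closed_unmerged_prs=22, account_age_days=12)
score = merge_rate(contributor)              # the 78% merge rate from the post
advisory = account_age_advisory(contributor) # separate note about the young account
```

Keeping the two functions separate means a reader can interpret the score directly and treat the advisory as a flag to look closer, which matches the "score now means one thing" framing.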

That pushed us to question the scoring model. The graph score (the most complex part of the system) actively hurt predictions for unknown contributors. Merge rate alone outperforms the full model at every tier.

3 weeks ago 0 0 1 0

We tried to detect suspended GitHub accounts from behavioral signals. LLMs, network analysis, title patterns. None of it worked on contributors who'd gotten code through review. They look like everyone else. The merge process itself is the filter.

3 weeks ago 0 0 1 0