MMLU scores are getting too high; I’m optimistic about GPQA. The weak-to-strong generalisation paper came out! It’s good work and a direction worth looking into, since it approaches the problem of scalable oversight from the other direction. I’m still not optimistic about it and don’t think the paper provides many updates. The OpenAI preparedness framework also came out and I’m extremely happy with it, especially with persuasion as a first-class risk. I’m excited for RSPs to be iterated on and to become clearer and more specific and gain coverage. Reflections on the UK AI Summit from Kanjun. The monosemanticity paper, which I highly recommend reading if you can only read one interp paper.

Another good data paper, though good data papers don’t need to have big insights or takeaways. The best Gwern posts are ones where I can’t tell if he’s joking, like this one on doing a giant MLP for everything. A paper on baking CoT processes into the model, which I generally like because there’s no reason for the number of tokens to correlate with how much compute is needed to do something. Um. A way to test if models were trained on the test set, but it requires knowing the exact strings and so isn’t that exciting. Using temperature scaling for calibration of DNNs. Understanding KL Divergence a bit better, though I’ll probably never need to think about that again. Trustworthy paper on Gemini evals.

The x.ai grok announcement, which is impressive though they evaluated other models at weird temperatures. Gemini report dropped! They generally did well, though I’m annoyed by eval milking even though it’s not unfair. New Yorker profile that mentions a researcher I respect but is mostly about the model. I want to do profiles on all my favourite people but it would take time and embarrass them I guess. Incredible tweet thread about Chinese nicknames for semiconductor manufacturing companies.

Read this book review for a book I have no intention of reading ever. I didn’t really need the financial advice, but I generally felt like exploring the documents of the great Mr. Money Mustache. The US government and their 7,500 watercolour fruits. It’s awesome that the Church of Jesus Christ of Latter-day Saints was the first website that showed me results when I was googling about this bok choy. Ant eusocial behaviour does have nanobot swarm energy. Turns out Wikipedia edit histories are a great source of funny material; I was refreshing this page on a certain news page and there are a lot of good commit messages in there. The Scientologists had a weird jazzy band. A branch of Christianity named after milk because they drank milk during Lent. Another weird guy to remind you that serial extreme lying is really common and not that hard. I should make a list of these guys for whenever someone tries to dismiss the possibility of a pathological liar.

A review on earning to give. Tom O’Donnell’s writing titled Libertarian Police Department. A thorough intro to the pharma industry that is really well-written even if you don’t care about pharma. The whole regulation thing is sort of new, and makes me skeptical of the whole “governments can’t react quickly to big technological developments” notion; it’s more like governments probably won’t react well. The better-than-the-Beatles effect is fun, and it’s generally nice as an AI risk-worrier to see that it’s possible to be damagingly risk averse. Have you heard about the tooth sauce that makes you never get cavities again? I’m still waiting for them to come out with an “N/M dentists hate this” stat, more info here. A heartfelt shutdown of Omegle, morality is hard these days.

The best fucking post-mortem I’ve ever read, about the Cloudflare outage. I get a bit of schadenfreude from outages just because I get excited about the post-mortem. This one was special because it was an enjoyable read, but I didn’t quite appreciate the attitude of blame towards their data center manager (who is definitely getting fired), especially when they missed some good practices and were out for far longer after the data centers were back up. It reminded me to reread this old classic. I finally looked up the origin of the term “cargo-cult” and it’s really fucked; that term will never be the same to me. Imagine someone has written a hit-piece about your company which consists entirely of quotes your engineers said in the company’s engineering podcast. A cute post about forming emotional attachment with your servers.

Matt Levine really hit the mark on the whole OpenAI situation; my favourite was this day’s.

Reread Lena, I still don’t get creeped out by it. If You Are Reading This is incredibly, utterly relatable. Another qntm that starts out really cute and takes a morbid but funny turn. The Difference did manage to creep me out, as did Gorge; those two are definitely my favourites of the listed qntm stories.

Partial Book Review: History of Western Philosophy by Bertrand Russell

I started reading this book like six months ago and am only just halfway done because I’m bad at reading. The first two hundred and seventeen pages are about Ancient Philosophy and the following two hundred and seventy pages are about Catholic Philosophy, so I thought I’d write some notes before venturing into the three hundred and forty-six pages of Modern Philosophy. Russell is a fantastic philosopher, really smart and seems underrated. He inserts opinions in isolated spots, and I at least always appreciated them. I don’t really need to know anything about ancient or Catholic philosophy, but here are some of the things my brain retained (which are only a little correlated with how interesting they are, but very correlated with how recently I read them):

  • Some great Catholic philosophers (like Occam) who did good thinking were perverse in that they weren’t in some truth-seeking regime, but rationalising existing beliefs. There were some goofy things, like Roger Bacon saying the first cause of ignorance is bad authority but specifying that it does not include the church. This seems rarer in this day and age, where truth-seeking is more common and virtuous, but I wonder where it’s happening that I’m not seeing? I feel like some research is locally like this in a good way.

  • There’s a funny event in the 14th century where the French lost the papacy when there were two popes declared (one Roman and one Avignon). The funny part is that it was resolved by just… asking a council? Cultures of absolute chain of command are good like this I guess. Russell writes about this with “therefore a power superior to a legitimate pope had to be found” and describes the Avignon pope (who was closely associated with the French monarch) as “addicted to favouritism and nepotism”. It’s a surprisingly funny book.

  • Saint Augustine semi-anticipated the cogito ergo sum. He also believed that it was morally important to live in solitude and did so for a number of years I don’t remember.

  • In the middle of all the boring Catholic philosophy stuff, Russell drops this, which I don’t endorse as fully accurate but is entertaining:

    Yahweh=Dialectical Materialism

    The Messiah=Marx

    The Elect=The Proletariat

    The Church=The Communist Party

    The Second Coming=The Revolution

    Hell=Punishment of the Capitalists

    The Millennium=The Communist Commonwealth

  • There was a joke about how the Roman army figured out they could take bribes when choosing an emperor, then assassinate the emperor and repeat.

  • Aristotle is really pro-slavery, and has some weird reasoning about how some people are better off when ruled by superiors (like animals), which justifies every single war as morally good. I only remember this because there was a funny comment from Russell: “Very satisfactory!”

  • Reading through the Ancient philosophy part was interesting as it reminded me how much of the standard logic and intuition I take as common sense had to be derived at one point. Simple things like how some opinions can be better than others even if you don’t know that they’re true, or how knowledge is useful for morality.

  • A lot of the content was context for how the works of specific philosophers were passed down, and may have been poorly documented by followers of the philosophers.

Book Review: A Very Short Introduction to Hegel by Peter Singer

I found this book at the store and I only picked it up because I saw “Peter Singer” and I was like no way! Not The Most Good You Can Do Peter Singer? It was! I was biased against the book because Hegel feels like a meme and the book opens with Singer saying that Hegel was the most “impactful” philosopher of the 20th century. Of course I thought to myself “Marx erasure?” but Singer had thought of it and followed it up with something about how because Hegel influenced Marx it doesn’t count or something.

I have never… read Hegel, but the only coherent takeaway I could pull was with regard to Hegel’s philosophy of history. The Hegelian notion of freedom actually really clicked with me, it’s what we today call “agency” (even though we use it in the definition of its original use) and he describes it with things like how loyalty to your nation isn’t freedom because it’s too much of a default, convenient belief. The history part is uncompelling though. Hegel describes the development of history as a gain of freedom, which is true in uninteresting ways. He fails to make a coherent claim about what the end state is, and claims that his 19th century Germany was it.

There was a bunch of stuff that didn’t seem fully cogent, though I understood bits and pieces. Probably there is more good stuff in there? Though it seems sufficiently hard to study that I will simply not.

I have a lot of priors against Hegel. To start, Hegel is notoriously hard to understand (there are countless introductory attempts) and background says I should worry that it’s hard to understand because there isn’t much to understand. Also a lot of what is attributed to Hegel is notes from his students. More bias against Hegel is that I tried to read Zizek’s jokes and he has horrible jokes, continental humour just doesn’t click with me.