Weeklog for Week 16: April 17 to April 23
Progress
Regular work week. No surprises, except that we still don't have internet at home, and this seems like it will persist for a few more weeks. Thank you, Deutsche Telekom!
TWIL
Jupyter now has a real-time collaborative mode. And it uses CRDTs, how awesome is that!
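Jupyter's collaboration is built on Yjs, but the core CRDT idea fits in a few lines. Here's a minimal sketch of a grow-only set CRDT in Python (purely illustrative, not Jupyter's actual code): replicas accept writes independently and converge by merging with a set union, which is commutative, associative, and idempotent.

```python
# Minimal grow-only set (G-Set) CRDT -- illustrates the convergence idea
# behind CRDTs, not Jupyter's actual Yjs-based shared document model.
class GSet:
    def __init__(self):
        self.items = set()

    def add(self, item):
        self.items.add(item)

    def merge(self, other):
        # Union is commutative, associative, and idempotent,
        # so replicas converge regardless of merge order or repeats.
        self.items |= other.items


# Two replicas edited independently (say, while one user is offline)...
a, b = GSet(), GSet()
a.add("cell-1: print('hi')")
b.add("cell-2: import numpy")

# ...converge to the same state once they exchange updates.
a.merge(b)
b.merge(a)
assert a.items == b.items
```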
Articles
- Free Dolly: Introducing the World's First Open and Commercially Viable Instruction-Tuned LLM - The Databricks Blog
- PolymorphicWetware comments on Global GDP is not exponential: it's superexponential
- Jackson Greathouse Fall on Twitter: "I gave GPT-4 a budget of $100 and told it to make as much money as possible. I'm acting as its human liaison, buying anything it says to. Do you think it'll be able to make smart investments and build an online business…"
- Replacing my best friends with an LLM trained on 500,000 group chat messages
- Dummy Boards: the Fun Figures of the 1600s - JSTOR Daily
- The Commission for Stopping Further Improvements
- How we made Jupyter Notebooks collaborative with Yjs -- by Kevin Jahns -- Jupyter Blog
- Introducing Jupyter Scheduler. The Open Source Jupyter team at AWS is… -- by Jason Weill -- Jupyter Blog
- Lost at SQL - SQL learning game
- Why the Brain’s Connections to the Body Are Crisscrossed -- Quanta Magazine
- Wtf is a kdf? -- blog.dataparty: This is about key derivation functions, but the actual kicker is this: the French police broke the LUKS encryption on a hard drive by brute-forcing the PBKDF2 step (a small sketch of why the iteration count matters follows this list).
- rl-for-llms.md · GitHub: "With the release of the ChatGPT model and followup large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback". I was puzzled for a while as to why RL (Reinforcement Learning) is better than learning from demonstrations (a.k.a supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language model terminology "instruction fine tuning", learning to imitate human written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argument which not only supports the case of RL training, but also requires it, in particular for models like ChatGPT." -- The argument is this: demonstrations only show the good path and never punish the wrong one; a toy illustration follows this list.
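On the KDF article: the reason the PBKDF2 step was brute-forceable is that the iteration count is the only per-guess cost an attacker pays. A minimal sketch with Python's standard-library hashlib (the passphrase, salt handling, and iteration count here are illustrative, not what LUKS actually uses):

```python
import hashlib
import os

# PBKDF2-HMAC-SHA256 key derivation. The iteration count is the work factor
# an attacker must pay per guess; the numbers here are illustrative only.
salt = os.urandom(16)

def derive(passphrase: str, iterations: int = 1_000) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, iterations)

target = derive("hunter2")

# Brute force is just re-running the KDF for each candidate passphrase,
# so a weak passphrase plus a low iteration count falls quickly.
for guess in ["123456", "password", "hunter2"]:
    if derive(guess) == target:
        print("cracked:", guess)
        break
```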
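And on the RL-for-LLMs argument, here is a toy numpy illustration (not how LLM training is actually implemented): supervised fine-tuning on a demonstration can only pull probability toward the demonstrated answer, while a policy-gradient update with a negative reward explicitly pushes probability away from a bad answer.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Logits over three candidate answers: [good, plausible, bad].
logits = np.zeros(3)

# Supervised learning from a demonstration of answer 0: the cross-entropy
# gradient only references the good path; the bad answer is never singled out.
grad_sl = softmax(logits) - np.eye(3)[0]
logits_sl = logits - 0.5 * grad_sl

# RL-style update after sampling answer 2 and receiving reward -1: the
# REINFORCE gradient actively lowers the probability of the bad answer.
reward = -1.0
grad_pg = reward * (np.eye(3)[2] - softmax(logits))
logits_rl = logits + 0.5 * grad_pg

print("after SL:", softmax(logits_sl))  # mass shifts toward answer 0
print("after RL:", softmax(logits_rl))  # mass shifts away from answer 2
```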
Libraries, programming, etc
- GitHub - allenai/mmc4: MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text.
- GitHub - ron-rs/ron: Rusty Object Notation
- Inigo Quilez :: computer graphics, mathematics, shaders, fractals, demoscene and more
Books
- Dragon's Egg by Robert L. Forward: finished. I am ambivalent about the ending. On the one hand, it's a beautiful, soft finish, without any conflict. On the other hand, the Cheela really take off, and there's just no telling where they've gone.
Games
- Dishonored 2 does not need an internet connection
Backlog
- Beyond Blue (free from EGS)
Recipes
- Kekse