Weeklog for Week 16: April 17 to April 23
Progress
Regular work week. No surprises, except that we still don't have internet at home, and this seems like it will persist for a few more weeks. Thank you, Deutsche Telekom!
TWIL
Jupyter now has a real-time collaborative mode. And it uses CRDTs, how awesome is that!
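Jupyter's collaboration is built on Yjs, but the core CRDT idea fits in a few lines. Here's a minimal sketch of a grow-only set CRDT in Python (purely illustrative, not Jupyter's actual code): replicas accept writes independently and converge by merging with a set union, which is commutative, associative, and idempotent.

```python
# Minimal grow-only set (G-Set) CRDT -- illustrates the convergence idea
# behind CRDTs, not Jupyter's actual Yjs-based shared document model.
class GSet:
    def __init__(self):
        self.items = set()

    def add(self, item):
        self.items.add(item)

    def merge(self, other):
        # Union is commutative, associative, and idempotent,
        # so replicas converge regardless of merge order or repeats.
        self.items |= other.items


# Two replicas edited independently (say, while one user is offline)...
a, b = GSet(), GSet()
a.add("cell-1: print('hi')")
b.add("cell-2: import numpy")

# ...converge to the same state once they exchange updates.
a.merge(b)
b.merge(a)
assert a.items == b.items
```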
Articles
- Free Dolly: Introducing the World's First Open and Commercially Viable Instruction-Tuned LLM - The Databricks Blog
- PolymorphicWetware comments on Global GDP is not exponential: it's superexponential
- Jackson Greathouse Fall on Twitter: "I gave GPT-4 a budget of $100 and told it to make as much money as possible. I'm acting as its human liaison, buying anything it says to. Do you think it'll be able to make smart investments and build an online business…"
- Replacing my best friends with an LLM trained on 500,000 group chat messages
- Dummy Boards: the Fun Figures of the 1600s - JSTOR Daily
- The Commission for Stopping Further Improvements
- How we made Jupyter Notebooks collaborative with Yjs -- by Kevin Jahns -- Jupyter Blog
- Introducing Jupyter Scheduler. The Open Source Jupyter team at AWS is… -- by Jason Weill -- Jupyter Blog
- Lost at SQL - SQL learning game
- Why the Brain’s Connections to the Body Are Crisscrossed -- Quanta Magazine
- Wtf is a kdf? -- blog.dataparty: This is about key derivation functions, but the actual kicker is this: the French police broke the LUKS encryption on a hard drive by brute-forcing the PBKDF2 step (a small sketch of why the iteration count matters follows this list).
- rl-for-llms.md · GitHub: "With the release of the ChatGPT model and followup large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback". I was puzzled for a while as to why RL (Reinforcement Learning) is better than learning from demonstrations (a.k.a supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language model terminology "instruction fine tuning", learning to imitate human written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argument which not only supports the case of RL training, but also requires it, in particular for models like ChatGPT." -- The argument is this: demonstrations only show the good path and never punish the wrong one; a toy illustration follows this list.
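On the KDF article: the reason the PBKDF2 step was brute-forceable is that the iteration count is the only per-guess cost an attacker pays. A minimal sketch with Python's standard-library hashlib (the passphrase, salt handling, and iteration count here are illustrative, not what LUKS actually uses):

```python
import hashlib
import os

# PBKDF2-HMAC-SHA256 key derivation. The iteration count is the work factor
# an attacker must pay per guess; the numbers here are illustrative only.
salt = os.urandom(16)

def derive(passphrase: str, iterations: int = 1_000) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, iterations)

target = derive("hunter2")

# Brute force is just re-running the KDF for each candidate passphrase,
# so a weak passphrase plus a low iteration count falls quickly.
for guess in ["123456", "password", "hunter2"]:
    if derive(guess) == target:
        print("cracked:", guess)
        break
```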
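And on the RL-for-LLMs argument, here is a toy numpy illustration (not how LLM training is actually implemented): supervised fine-tuning on a demonstration can only pull probability toward the demonstrated answer, while a policy-gradient update with a negative reward explicitly pushes probability away from a bad answer.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Logits over three candidate answers: [good, plausible, bad].
logits = np.zeros(3)

# Supervised learning from a demonstration of answer 0: the cross-entropy
# gradient only references the good path; the bad answer is never singled out.
grad_sl = softmax(logits) - np.eye(3)[0]
logits_sl = logits - 0.5 * grad_sl

# RL-style update after sampling answer 2 and receiving reward -1: the
# REINFORCE gradient actively lowers the probability of the bad answer.
reward = -1.0
grad_pg = reward * (np.eye(3)[2] - softmax(logits))
logits_rl = logits + 0.5 * grad_pg

print("after SL:", softmax(logits_sl))  # mass shifts toward answer 0
print("after RL:", softmax(logits_rl))  # mass shifts away from answer 2
```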
Libraries, programming, etc
- GitHub - allenai/mmc4: MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text.
- GitHub - ron-rs/ron: Rusty Object Notation
- Inigo Quilez :: computer graphics, mathematics, shaders, fractals, demoscene and more
Books
- Dragon's Egg by Robert L. Forward: finished. I am ambivalent about the ending. On the one hand, it's a beautiful, soft finish, without any conflict. On the other hand, the Cheela really take off, and there's just no telling where they've gone.
Games
- Dishonored 2 does not need an internet connection
Backlog
- Beyond Blue (free from EGS)
Recipes
- Kekse