Reviews

2022-M10

Two months ago, I decided to quit my company and dedicate myself full-force to AI safety. The problems I had been working on were not inspiring me, and the actual work left me feeling like my brain was shrinking. Something had to change.

So far, this feels like one of the best decisions I've ever made.

I received an FTX Future Fund regrant for six months to transition to research. My plan for this period rests on three pillars: (1) technical upskilling in ML, (2) theoretical upskilling in AI safety, and (3) networking/community outreach.

Concretely, my plan is to (1) read lots of textbooks and follow online courses, (2) read a lot of the Alignment Forum and go through curricula (like Richard Ngo's AGI Safety Fundamentals and Dan Hendrycks's Intro to ML Safety), and (3) travel to events, apply to different fellowships, and complete small research projects.

A month and a half has gone by since I really started, which turns out to be quite a lot of time. Enough that it's a good moment for a progress report and forecast.

Technical Upskilling in ML

Textbooks

  • Mathematics for Machine Learning by Deisenroth, Faisal, and Ong (2020).
    • This is a wonderful book. Clear, concise writing. Excellent visuals (color-coded with the corresponding formulas!). It hints at what Chris Olah might be able to do with the textbook genre if he got his hands on it.
    • I've completed up to chapter 9 (that's the first half plus one chapter of the second half). I'll finish the book this month.
  • Pattern Recognition and Machine Learning by Bishop (2006).
    • This book is… okay. Sometimes. It leaves much to be desired on the visualizing front, and in retrospect, I probably wouldn't recommend it to others. But it does provide a strong probabilistic supplement to a wider ML curriculum.
    • I've done up to chapter 5 and skipped ahead to do chapter 9. I plan to go through the rest of the book for completeness. Even if many methods are not immediately relevant to the DL paradigm, a broad basis in statistics and probability theory certainly is. I'm most looking forward to the chapters on causal models (8), sampling techniques (11) and hidden Markov models (13). This should be done by mid-December.
  • Cracking the Coding Interview by McDowell (2015)
    • The widespread goodharting of leetcode is one of many reasons I'm afraid of AI. We just have to deal with it.
    • I've completed chapters 1-7, with 10(-ish) to go. I'm aiming to be done with this by January.

I couldn't help myself and got some more textbooks. When I finish MML, I'll move on to Sutton and Barto's Reinforcement Learning. In December, I'll start on Russell and Norvig's Artificial Intelligence: A Modern Approach. Now that I think about it, I should probably throw Goodfellow's Deep Learning in the mix.

Courses

  • Practical Deep Learning for Coders by fast.ai
    • I began following this course but was disappointed by it, mostly because its level was too basic, and its methods were too applied. So I stopped following the course.
  • ARENA Virtual
    • Two weeks ago, a friend introduced me to ARENA Virtual, and I jumped on the opportunity. This program follows a curriculum based on Jacob Hilton's curriculum, and it's much more my cup of tea. It assumes prior experience, goes much deeper, and is significantly faster-paced. It's also super motivating to work with others.
    • This goes until late December.

Once ARENA is done, I might pick and choose from other online courses like OpenAI's Spinning Up, NYU's Deep Learning, etc. But I don't expect this to be necessary anymore, and it may even be counterproductive. ARENA + textbooks is likely to be enough to learn what I need. Any extra time can probably best go towards actual projects.

Theoretical Upskilling in AI Safety

Courses

  • AGI Safety Fundamentals by Richard Ngo
    • I'm going through this on my own and reading everything (the basics + supplementary material). I'm currently on week 7 of 8, so I'll finish this month.
  • Intro to ML Safety by Dan Hendrycks
    • As soon as I finish AGISF, I'll move on to this course.

Once I'm done with Intro to ML Safety, I'll go on to work through AGI Safety 201. In the meantime, I've also gone through lots of miscellaneous sequences: Value Learning, Embedded Agency, Iterated Amplification, Risks from Learned Optimization, Shard Theory, Intro to Brain-Like-AGI Safety, Basic Foundations for Agent Models, etc. I'm also working my way through AXRP and The Inside View for an informal understanding of various researchers.

Over the last two months, I've actually found myself becoming less of a doomer and developing longer timelines.¹ In terms of where I see myself ending up: it's still interpretability with an uptick in interest for brain-flavored approaches (Shard Theory, Steven Byrnes). I picked up Evolutionary Psychology by David Buss and might pick up a neuroscience textbook one of these days. My ideal fit is still probably Anthropic.

Network & Outreach

Programs

  • SERIMATS. The essay prompts were wonderful practice in honing my intuitions and clarifying my stance. I think my odds of getting in are good, and that this is the highest-value thing I can currently do to speed up my transition into AI safety. The main downside is that SERIMATS includes an in-person component that will be in the Bay starting in January. That's sooner than I would move in an ideal world. But then I guess an ideal world has solved alignment. 🤷‍♂️
  • REMIX (by Redwood). I'll be applying this week. This seems as good an opportunity as SERIMATS.

I received the advice to apply more often, that is, to already send off applications to Anthropic, Redwood, etc. I think the attitude is right, but my current approach is already sufficient. Let's check in when we hear back from these programs.

Research

  • I've also put together a research agenda (email me if you want the link). In it, I've begun dissecting how the research I did during my master's on toy models from theoretical neuroscience could inform novel research directions for interpretability and alignment. I'm starting a few smaller experiments to better understand the path-dependence of training.
  • I've also started a collaboration with Diego Dorn to review the literature on representation learning and how to measure distance/similarity between different trained models.

I've decided to hold off on publishing what I've written up in my research agenda until I have more results. Some of the experiments are really low-hanging fruit, yet helpful to ground the ideas, so I figure it's better to wait a little and immediately provide the necessary context.

Networking

  • I attended an AI Safety retreat organized by EA NL, which was not only lots of fun, but introduced me to lots of awesome people.
  • I'll be attending EAGxRotterdam next week, and EAGxBerkeley in December. Even more awesome people coming soon.

Miscellaneous

  • As a final note, I'm working with Hoog on a video about AI safety. It's going to be excellent.

Footnotes

  1. More on why in a future post.

2022-Q3

A lot has changed for me in the past month. My partner and I decided to close the business we had started together, and I've thrown myself full-force at AI safety.

We weren't seeing the traction we needed, I was nearing the edge of burnout (web development is not the thing for me¹), and, at the end of the day, I did not care enough about our users. It's hard to stay motivated to help a few patients today when you think there's a considerable risk that the world might end tomorrow. And I think the world might end soon: not tomorrow, but more likely than not in the next few decades.² Eventually, I reached a point where I could no longer look away, and I had to do something.

So I reached out to the 80,000 Hours team, who connected me to people studying AI safety in my area and helped me apply to the FTX Future Fund Regranting Program for a six-month, $25,000 upskilling grant to kickstart my transition to AI.

Now, I'm not a novice (my bachelor's and master's theses applied techniques from statistical physics to understand neural networks), but I could definitely use the time to refresh & catch up on the latest techniques. A year is a long time in AI.

Next to "upskilling" in ML proper, I need the time to dive deep into AI safety: there's overlap with the conventional ML literature, but there's also a lot of unfamiliar material.

Finally, I need time to brush up my CV and prepare to apply to AI labs and research groups. My current guess is that I'll be best-suited to empirical/interpretability research, which I think is likely to be compute-constrained. Thus, working at a larger lab is crucial. That's not to mention the benefits of working alongside people smarter than you are. Unfortunately (for me), the field is competitive, and a "gap year" in an unrelated field after your master's is likely to be perceived as a weakness. There's a signaling game at hand, and it's play or be played. To sum up, spending time on intangibles like "networking" and tangibles like "publications"³ will be a must.

To keep myself focused throughout the next half year, I'll be keeping track of my goals and progress here. To start, let's take a look at my current plan for the next half year.

Learning Plan

Like all good plans, this plan consists of three parts:

  1. Mathematics/Theory of ML
  2. Implementation/Practice of ML
  3. AI Safety

There's also an overarching theme of "community-building" (i.e., attending EAGs and other events in the space) and of "publishing".

Resources

Textbooks

  • Mathematics for Machine Learning by Deisenroth, Faisal, and Ong (2020).
    • I was told that this book is predominantly important for its first half, but I'm ready to consume it in full.
  • Pattern Recognition and Machine Learning by Bishop (2006)
    • I was advised to focus on chapters 1-5 and 9, but I'm aiming to at least skim the entirety.
  • Cracking the Coding Interview by McDowell (2015)
    • One specification I'm going to have to game is the interview. I'm also taking this as an opportunity to master Rust, as I think having a solid understanding of low-level systems programming is going to be an important enabler when working with large models.

ML/DL Courses

There are a bunch more, but these are the only ones I'm currently committing to finishing. The rest can serve as supplementary material after.

AI Safety Courses

Miscellaneous

Publishing

I'm not particularly concerned about publishing in prestigious journals, but getting content out there will definitely help. Most immediately, I'm aiming to adapt my master's thesis for an AI safety/interpretability audience. I'm intrigued by several possibilities: that perspectives like the Lyapunov spectrum can help us enforce constraints like "forgetfulness" (which may be a stronger condition than myopia), analyze the path-dependence of training, and detect sensitivity to adversarial attacks / improbable inputs; that random matrix theory might offer novel ways to analyze the dynamics of training; and, more generally, that statistical physics is an un(der)tapped source of interpretability insight.

In some of these cases, I think it's likely that I can come to original results within the next half year. I'm going to avoid overcommitting to any particular direction just yet, as I'm sure my questions will get sharper with my depth in the field.

Next to this, I'm reaching out to several researchers in the field and offering myself up as a research monkey. I trust that insiders will have better ideas than I can form as of yet, but not enough resources to execute (in particular, I'm talking about PhD students), and that if I make myself useful, karma will follow.

Timeline

Over the next three months, my priority is input: to complete the textbooks and courses mentioned above (which means taking notes, making flashcards, doing exercises). Over the subsequent three months, my priority is output: to publish & apply.

Of course, this is a simplification; research is a continuous process: I'll start to produce output before the first three months are up & I'll continue to absorb lots of input after they're up. Still, heuristics are useful.

I'll be checking in here on a monthly basis, reviewing my progress over the previous month & updating my goals for the next month. Let's get the show on the road.

Month 1 (October)

Highlights

  • Finish part 1 of MML
  • Finish chapters 1-5 of PRML
  • Finish intro & chapters 1-5 of Cracking the Coding Interview
  • Finish lessons 1-5 of Practical Deep Learning for Coders
  • Finish weeks 1-5 of AGI Safety Fundamentals
  • ██ ██████████ ████████ ███ ████████ ████████ ██████

Footnotes

  1. At least not as a full-time occupation. I like creating things, but I also like actually using my brain, and too much of web development is mindless twiddling (even post-Copilot).

  2. More on why I think this soon.

  3. Whether in formal journals or informal blogs.

  4. I'm including less formal / "easier" sources because I need some fallback fodder (for when my brain can no longer handle the harder stuff) that isn't Twitter or Hacker News.

2022-Q2

I fell off the wagon.

This was supposed to be the year of the quantified self. I set out to track every minute of my time across several dozen goals in thirteen categories.

The effort began strong. For three months I was off of social media, exercising super consistently, timing mostly everything, and on track towards my goals.

Then... something happened.

Between April and July, the tracking vanished. I returned to my vices, and if anything, I became more distractible than I've been in ages. Not a second of tracking, and all hopes of achieving my goals in the trash.

What happened?

My leading theory is that it was moving-related. In February, I moved to Brasília for two months. In April, I moved back to the US, and by June, I was back in the Netherlands.

Brasília was in many ways a delight: great weather, amazing fruit, a sauna and gym two minutes from my door, and everything extremely affordable. Other things were less than great: the internet speeds (at least at first), my workstation (I appropriated a low-res TV for a monitor on a minuscule kitchen table), no AC.

These things seem minor, but they add up over time. If it takes 60 seconds to install a new package, you open up Hacker News and end up wasting five minutes. Each five-minute session chips away at your attentional capacity. The frustration builds and burns you out. DX matters.

Still, mostly I was on track.

What probably caused the discipline to falter was the disruption of coming back. Moves are great opportunities to change behaviors, but this works in either direction, and the asymmetry of habit formation means you have to be extra careful.

When you're moving a bunch in a short period of time, you have to be even more careful because ego depletion¹ comes into play. You exhaust your willpower and become more susceptible to developing bad habits with each successive move.

Seasoned digital nomads probably have their tricks to get around this, but that's not me yet. Lesson learned.

A few other problems at play:

  • My Obsidian has again become a disordered wreck. I keep on trying to impose fragile top-down hierarchies on the notes, and it ends up breaking everything.
  • In a related vein, I've come to the conclusion that using Obsidian for both task management and knowledge management is bad practice. Tasks should vanish when done. If they linger around they'll muck up your access to the more important persistent knowledge. I've moved to trying out Linear instead.
  • Tracking was much too manual. I was tracking in Obsidian, which proved too unstructured (similar concern to "not using Obsidian for task management"), so I moved on to Google Sheets, which is a nightmare (as you know). This time, I'm going to give Airtable a shot (which takes inspiration & validation from the professionals).
  • At the start of the year, I redesigned my website, because using Next.js for a static site was overkill, but then I went too far in the opposite direction (towards raw, uncut HTML). The problem with this is that publishing regularly is the best way for me to orient my review and tracking processes. When that output process becomes too unergonomic, it clogs up the rest of the pipeline. I'm now using Astro with a custom, simplified export pipeline (a successor to my previous solution & a set of plugins to recreate Obsidian-flavored markdown in the Unified.js ecosystem). This isn't public yet, but it will be when I've ironed out the kinks.

Whatever the reason, it's in the past, and every day is a chance to start fresh.

Let's try again

We've still got basically half a year left. What can we recover?

  1. 🛑 No more scrolling (YouTube, Reddit, Porn, etc.):
    • Right. That failed miserably: I even ended up caving and finally getting on Twitter. 🤷‍♂️ I still agree with the intent but can't deny the value of being up to date with Hacker News, tech Twitter, and edu-YouTube.
    • My idea of a solution was using Inoreader. There, the problem was that it was filling up much too quickly and that much of it was low quality. This time around I need to be more diligent in removing feeds that don't serve me.
    • I'm going to try again. This time, I'll allow myself a dash of Hacker News a day (call it part of the job requirements). YouTube, I'll get through Inoreader, and the rest, hopefully never. (I may relax this further, and allow myself some maximum amount of time per day on these trash platforms.)
  2. 🚪 Screen time:
    • I'm scrapping the limit for desktop (because programming is my job).
    • The main obstacle to actually tracking this was that I was manually copying the information every week. This is a perfect opportunity for automation (there is fortunately an API, but you have to call it on device). Until I get access to the data, I'm not going to require myself to track this, but the goal stands: Less than an hour a day as a baseline; less than two as a stretch.
  3. ⏲ Self-monitoring:
    • Toggl was easy and intuitive; my main obstacle was that I had defined too many different projects and types of tasks.
    • Time to simplify: Only three projects (work, personal, misc). Only a handful of allowed labels: programming, reading, watching, wasting time, organizing, meeting, writing.
    • Also, no more manual copying stuff over. I'm a programmer and should know better. Same for Apple Health information about exercise.
  4. 📚 Books (1 book per week):
    • I'm a bit behind schedule. 19 books in fact. But that's ok, there's plenty of time to catch up. That said, I am scrapping all of the specific goals like read X books in French, Y by this author. I'll just read what I want to read.
  5. 🗃 PKM:
    • I'm going to remove all specific goals and just commit to regular maintenance.
  6. ✍️ Writing:
    • I haven't been writing, but I have plenty of room to catch up with my goal of 6 articles.
    • I missed M4-M7 & Q1, but whatever. For the rest of the year, I'm forbidding myself from including any quantitative result in my reviews that I haven't automated.
  7. 🗣 Languages:
    • I'm scrapping this goal. It was too ambitious from the start. I do want to catch up again, but I have one or two tools I want to finish up before I actually start learning Chinese.
    • My main goal in this category is to just catch up on Anki again & to have no overdue cards in any of my principal decks (General, French, Portuguese, Dutch). Stretch goal if I can work in Italian and German.
  8. 🏃 Moving
    • Subjectively, I'm happy enough with my movement. I'm going to avoid setting quantitative goals until I've automated the information capture.
  9. 🍽 Fasting
    • When I fell off the productivity wagon, I also fell off the IF wagon for the first time in 5 years (but I'm back again).
    • This has fallen by the wayside but it's totally recoverable. I'm going to start committing to one day (Monday) a week for the rest of the year.
  10. 🌏 Diet:
    • I was tracking meat & alcohol consumption. In hindsight, it required a bit too much input. I'm going to drop this until next year.
  11. 👓 Myopia:
    • The initial progress I've made seems to have been undone by staring at the computer screen for ungodly amounts of time. We'll fix this at some future point.
  12. 👥 Relationships:
    • Mentorship & community: I'd actually say that I've achieved these goals, though not in the way originally envisioned. I've found my mentors in the right software development streamers & my community in the right Discords. 2022, eh? I'm crossing this off as completed.
  13. 💰 Money:
    • We moved back to the cheap Netherlands, and I got a side-job for about one day a week, and we're golden. It's a lot easier if you decide you don't have to live in the US.
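
As a rough illustration of the simplified self-monitoring scheme above (three projects, a handful of allowed labels), here is a minimal Python sketch. The `Entry` type and the `validate` and `minutes_by_label` helpers are hypothetical names invented for this example; they are not part of Toggl or any existing tool, and the sample entries are made up.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical encoding of the simplified scheme: three projects, seven labels.
PROJECTS = {"work", "personal", "misc"}
LABELS = {
    "programming", "reading", "watching", "wasting time",
    "organizing", "meeting", "writing",
}


@dataclass(frozen=True)
class Entry:
    """One tracked block of time (illustrative structure, not a Toggl type)."""
    project: str
    label: str
    minutes: int


def validate(entry: Entry) -> None:
    """Reject entries that fall outside the allowed projects and labels."""
    if entry.project not in PROJECTS:
        raise ValueError(f"unknown project: {entry.project!r}")
    if entry.label not in LABELS:
        raise ValueError(f"unknown label: {entry.label!r}")
    if entry.minutes < 0:
        raise ValueError("minutes must be non-negative")


def minutes_by_label(entries: list[Entry]) -> dict[str, int]:
    """Sum tracked minutes per label, validating each entry first."""
    totals: dict[str, int] = defaultdict(int)
    for entry in entries:
        validate(entry)
        totals[entry.label] += entry.minutes
    return dict(totals)


# Made-up sample data, purely for illustration.
sample = [
    Entry("work", "programming", 90),
    Entry("personal", "reading", 30),
    Entry("work", "programming", 45),
]
print(minutes_by_label(sample))
```

Keeping the allowed projects and labels in two small sets means anything outside the scheme fails loudly, which is one way to stop the categories from multiplying again.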

Footnotes

  1. I've read this has been somewhat debunked, so take it with the proper grain of salt.

2022-M2

Key:

  • When appropriate, a goal will have (baseline|stretch goal) next to it.
  • ✅: baseline
  • ✅ ✅: stretch
  • ❌: neither

Goals:

  1. 🛑 No more scrolling (YouTube, Reddit, Porn, etc.): ✅
    • I've had one or two lapses this month: not quite scrolling, but more like following a video vortex. Curating my own feed via Inoreader isn't quite the saving grace I imagined it to be. It still ends up quickly saturating with way too much info and there's plenty of low quality content that slips through the cracks.
  2. 🚪 Screen time (12|9h per day): 10h5m ✅ (=)
    • Phone (2|1h): 82m ✅ (+28m)
    • Computer (12|10h): 9h48m ✅ (-8m)
    • I've increased my allowed computer time goal by two hours because I have to accept that working on the computer is my job.
  3. ⏲ Self-monitoring: ✅
    • I've started tracking my workday in much more detail and may start releasing more precise time for each subarea.
  4. 📚 Books (1 book per week): 4b ✅
  5. 🗃 PKM: Nothing to say here for this month, except that I'm working on a set of "starting vaults" for LessWrong and 80,000 Hours. Stay tuned.
  6. ✍️ Writing: ✅
    • Articles (1 article every 2 months | every month): ✅ I've been writing, but I haven't been publishing; I'm waiting for further improvements to my site.
    • Reflections (1 reflection per month | and a newsletter): ✅ You're reading it.
  7. 🗣 Languages (1000|2000 cards per month): 644 ❌ (-356) (506 cards behind).
    • While in Brazil, I'm actually spending less time making new cards (and more time talking with people). But I could be doing more. This is a chance to rededicate my efforts. Specifically to aim for 50 new cards per day. By the end of the month, I'll have about caught up.
  8. 🏃 Moving
    • Rings (85|95% of the time): 86% ✅
    • Steps (5k|10k): 7.9k ✅
    • Skills: ❌ I told myself I would start doing this this month but I didn't. So my goal to get on track is to start tracking my handstand time at least once a week.
  9. 🍽 Fasting (1x36h per month): ❌ Skipped fasting this week (also haven't been intermittent fasting while in Brazil), so I'm going to squeeze in two fasts this coming month (and make one of them a three-day fast).
  10. 🌏 Diet: ✅
    • Meat (8x|4x 🍗; 1x per month | per two months 🥩🥓...): 8x🍗 ✅
    • Alcohol (8x|4x 🍷): 1x ✅✅
  11. 👓 Myopia: (-.25|-0.25 diopters) ❌
    • I should have been more skeptical last month. Much of the improvements I noted then have subsided. This could be evidence of biased measurements in the past and sampling error, or maybe it's that I've spent way too much time this month behind the computer. Either way, I need to double down and make sure I'm working at blur distance.
  12. 👥 Relationships:
    • Mentorship. ✅ (80,000 Hours career-coaching)
      • I've now finished The Precipice and am almost done with 80,000 Hours' Career-Planning Process, so this month, I'm aiming to arrange my first coaching call.
      • I met with James Norris of Upgradable, which seemed like a promising fit, but probably won't work out. James is the kind of person whose opinion I'm likely to value too much. I already carry the weight of the world and don't want another set of expectations to carry. This reflects less on James than it does on me. But it does help me clarify what I'm looking for in a mentor.
    • Community. ✅ (Open Principles Fellowship)
      • The Open Principles Fellowship introduced me to several people I'll be staying in contact with, but it's already done, so my quest continues.
  13. 💰 Money:
    • No change.
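
The catch-up arithmetic in the Languages goal above can be checked with a short sketch. The function names are hypothetical, the 150-card starting backlog is inferred from the reported figures (506 behind after a 356-card shortfall), and the 30-day month is an assumption.

```python
BASELINE_PER_MONTH = 1000  # baseline goal from the Languages item


def backlog_after_month(previous_backlog: int, cards_made: int) -> int:
    """Backlog grows by however far this month fell short of the baseline."""
    return previous_backlog + (BASELINE_PER_MONTH - cards_made)


def projected_backlog(backlog: int, new_per_day: int, days: int) -> int:
    """Backlog shrinks by whatever a daily pace produces beyond the baseline."""
    return backlog - (new_per_day * days - BASELINE_PER_MONTH)


# 644 cards made against a 1000-card baseline, starting 150 behind (inferred).
print(backlog_after_month(150, 644))   # 506

# 50 new cards a day over an assumed 30-day month.
print(projected_backlog(506, 50, 30))  # 6
```

Under these assumptions, 50 cards a day yields 1500 cards, 500 more than the baseline, which leaves the backlog at roughly zero: "about caught up".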

2022-M3

Wasn't my best month. I've fallen behind in a few areas and need a concerted push to catch up again. Shit happens.

Key:

  • When appropriate, a goal will have (baseline|stretch goal) next to it.
  • ✅: baseline
  • ✅ ✅: stretch
  • ❌: neither

Goals:

  1. 🛑 No more scrolling (YouTube, Reddit, Porn, etc.): ✅
    • I had a few slips with YouTube this month: watching a video on my phone and following the funnel. Something to double down on next month.
  2. 🚪 Screen time (12|9h per day): 10h15m ✅ (+10m)
    • Phone (2|1h): 95m ✅ (+13m)
    • Computer (12|10h): 8h59m ✅ (-49m)
    • Last month I wanted to increase my computer time because I thought I would be going over, but this is proving an unnecessary change.
  3. ⏲ Self-monitoring: Need another month or two before I'll share this info
  4. 📚 Books (1 book per week): 1b ❌
    • I'm going to count Mad Investor Chaos as a five-book deal in total (when completed).
  5. 🗃 PKM: Haven't made enough progress on my starting vaults since last month. This deserves more attention.
  6. ✍️ Writing: ✅
    • Articles (1 article every 2 months | every month): ❌ Writing has taken a back seat.
    • Reflections (1 reflection per month | and a newsletter): ✅
  7. 🗣 Languages (1000|2000 cards per month): 700 ❌ (-300) (856 cards behind). So I didn't catch up like I had planned to last month. In fact, I've fallen behind even just on my Anki reviews. I'm aiming this month to close some of the gap, so I'm only 500 cards behind at the end of it. This means keeping close to a target of 50 new cards a day.
  8. 🏃 Moving
    • Rings (85|95% of the time): 82% ❌
    • Steps (7.5k|10k): 2k ✅
    • Skills: ❌
  9. 🍽 Fasting (1x36h per month): ❌
  10. 🌏 Diet: ✅
    • Meat (8x|4x 🍗; 1x per month | per two months 🥩🥓...): 9x🍗 ❌ 6x other ❌ This was mostly inadvertent (in ordering the wrong thing in a language I'm not super familiar with), but it means I'll have to keep the next four months clear of non-poultry.
    • Alcohol (8x|4x 🍷): 2x ✅✅
  11. 👓 Myopia: (-.25|-0.25 diopters) Skipping measurement this month because I lacked access to a measuring tape.
  12. 👥 Relationships: This has been put on the back burner a bit (though I did meet lots of wonderful Brazilians at the sauna); it's just that these relationships are less professional than personal.
    • Mentorship. ❌
    • Community. ❌
  13. 💰 Money: ❌