Feed

No, human brains are not more efficient than computers

Epistemic status: grain of salt. There's lots of uncertainty in how many FLOP/s the brain can perform.

In informal debate, I've regularly heard people say something like, "oh but brains are so much more efficient than computers" (followed by a variant of "so we shouldn't worry about AGI yet"). Putting aside the weakly argued AGI skepticism, brains actually aren't all that much more efficient than computers (at least not in any way that matters).

The first problem is that these people are usually comparing the energy requirements of training large AI models to the power requirements of running the normal waking brain. These two things don't even have the same units.

The only fair comparison is between the trained model and the waking brain or between training the model and training the brain. Training the brain is called evolution, and evolution isn't particularly known for its efficiency.

Let's start with the easier comparison: a trained model vs. a trained brain. Joseph Carlsmith estimates that the brain delivers roughly 1 petaFLOP/s ($= 10^{15}$ floating-point operations per second) [1]. If you eat a normal diet, you're expending roughly $10^{-13}$ J/FLOP.

Meanwhile, the supercomputer Fugaku delivers 450 petaFLOP/s at 30 MW, which comes out to about $10^{-10.2}$ J/FLOP... So I was wrong? Computers require almost 500 times more energy per FLOP than humans?

$\frac{\text{Supercomputer J}/\text{FLOP}}{\text{Human J}/\text{FLOP}}$

Pasted image 20220906142829.png
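The per-FLOP arithmetic can be sketched in a few lines of Python. This is a back-of-the-envelope check with point values rather than the distributions used for the graph: the 9,000 kJ/day intake is an assumed midpoint of the 8,000 to 10,000 kJ range in the appendix, and the function name is only illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def joules_per_flop(power_watts: float, flop_per_second: float) -> float:
    """Watts are J/s, so dividing by FLOP/s leaves J/FLOP."""
    return power_watts / flop_per_second


# Human: ~9,000 kJ of food per day (assumed midpoint), ~1e15 FLOP/s (Carlsmith).
human_power_watts = 9_000e3 / SECONDS_PER_DAY
human = joules_per_flop(human_power_watts, 1e15)

# Fugaku: 450 petaFLOP/s at 30 MW.
fugaku = joules_per_flop(30e6, 450e15)

ratio = fugaku / human

print(f"human:  {human:.2e} J/FLOP")
print(f"fugaku: {fugaku:.2e} J/FLOP")
print(f"ratio:  {ratio:.0f}x")
```

With these point values the ratio lands in the mid hundreds, the same ballpark as the "almost 500" figure; the exact number shifts with whichever intake and power values you plug in.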

What this misses is an important practical point: supercomputers can tap pretty much directly into sunshine; human food calories are heavily-processed hand-me-downs. We outsource most of our digestion to mother nature and daddy industry.

Even the most whole-foods-grow-your-own-garden vegan is 2-3 orders of magnitude less efficient at capturing calories from sunlight than your average device [2]. That's before animal products, industrial processing, or any of the other Joules it takes to run a modern human.

After this correction, humans and computers are roughly neck and neck in energy per FLOP, and it's only getting worse for us humans. The fact that the brain runs on so little actual juice suggests there's plenty of room left for us to explore specialized architectures, but it isn't the damning case many think it is. (We're already seeing early neuromorphic chips outperform neurons' efficiency by four orders of magnitude.)

$\frac{\text{Electronic efficiency}}{\text{Biological efficiency}}$

Pasted image 20220906143040.png
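For a rough sense of where the sunlight-capture gap comes from, here is a small Python sketch using the point estimates from footnote 2 (about 1% photosynthesis, a tenfold loss per trophic level, 20% solar panels, 10% transmission loss). The constant names are made up for illustration, and the appendix snippet uses ranges rather than these single values.

```python
import math

# Point estimates taken from footnote 2; the appendix uses ranges instead.
PHOTOSYNTHESIS = 0.01    # fraction of sunlight stored as plant calories
TROPHIC_STEP = 0.10      # fraction of calories kept per step up the food chain
SOLAR_PANEL = 0.20       # fraction of sunlight converted to electricity
TRANSMISSION = 1 - 0.10  # fraction of electricity that survives the grid


def orders_of_magnitude(ratio: float) -> float:
    """Express a ratio as a base-10 exponent."""
    return math.log10(ratio)


computer = SOLAR_PANEL * TRANSMISSION
plant_diet = PHOTOSYNTHESIS
one_step_up = PHOTOSYNTHESIS * TROPHIC_STEP

gap_plants = orders_of_magnitude(computer / plant_diet)
gap_one_step_up = orders_of_magnitude(computer / one_step_up)

print(f"plants eaten directly: {gap_plants:.2f} OOM")
print(f"one trophic level up:  {gap_one_step_up:.2f} OOM")
```

At these point values the gap is a bit over one order of magnitude for plants eaten directly and a bit over two with one trophic step. The low end of the appendix's photosynthesis range (0.1%) adds another order of magnitude to either figure.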

But what about training neural networks? Now that we know the energy costs per FLOP are about equal, all we have to do is compare FLOPs required to evolve brains to the FLOPs required to train AI models. Easy, right?

Here's how we'll estimate this:

  1. For a given, state-of-the-art NN (e.g., GPT-3, PaLM), determine how many FLOP/s it performs when running normally.
  2. Find a real-world brain which performs a similar number of FLOP/s.
  3. Determine how long that real-world brain took to evolve.
  4. Compare the number of FLOPs (not FLOP/s) performed during that period to the number of FLOPs required to train the given AI.

Fortunately, we can piggyback off the great work done by Ajeya Cotra on forecasting "Transformative" AI. She calculates that GPT-3 performs about $10^{12}$ FLOP/s [3], or about as much as a bee.

Going off Wikipedia, social insects evolved only about 150 million years ago. That translates to between $10^{38}$ and $10^{44}$ FLOPs. GPT-3, meanwhile, took about $10^{23.5}$ FLOPs. That means evolution is $10^{15}$ to $10^{22}$ times less efficient.

$\log_{10}\left(\text{total FLOPs to evolve bee brains}\right)$

Pasted image 20220906143416.png
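The log-space bookkeeping behind this estimate can be sketched in Python. This version only adds up the endpoints of the ranges in the appendix snippet (ancestor population $10^{19}$ to $10^{23}$, ancestor brains $10^{2}$ to $10^{6}$ FLOP/s, 850 million years), so it brackets the result instead of reproducing the plotted distribution. The function name is illustrative.

```python
import math

SEC_PER_YEAR_LOG10 = math.log10(365 * 24 * 60 * 60)
YEARS_LOG10 = math.log10(850e6)  # 1 billion to 150 million years ago
GPT3_TRAINING_LOG10 = 23.5       # log10 of GPT-3's training FLOPs, from the text


def evolution_flops_log10(pop_log10: float, brain_flops_log10: float) -> float:
    """log10 of total FLOPs: population * FLOP/s per brain * seconds elapsed."""
    return pop_log10 + brain_flops_log10 + SEC_PER_YEAR_LOG10 + YEARS_LOG10


# Endpoints of the appendix ranges, not a full Monte Carlo.
low = evolution_flops_log10(19, 2)
high = evolution_flops_log10(23, 6)

gap_low = low - GPT3_TRAINING_LOG10
gap_high = high - GPT3_TRAINING_LOG10

print(f"evolution: 10^{low:.1f} to 10^{high:.1f} FLOPs")
print(f"gap vs. GPT-3: {gap_low:.1f} to {gap_high:.1f} OOM")
```

Because these are raw endpoints and not a credible interval, they sit slightly outside the range quoted in the text, but the gap is the same order either way.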

Now, you may have some objections. You may consider bees to be significantly more impressive than GPT-3. You may want to select a reference animal that evolved earlier in time. You may want to compare unadjusted energy needs. You may even point out the fact that the Chinchilla results suggest GPT-3 was "significantly undertrained".

Object all you want, and you still won't be able to explain away the >15 OOM gap between evolution and gradient descent. This is no competition.

What about other metrics besides energy and power? Consider that computers are about 10 million times faster than human brains. Or that if the human brain can store a petabyte of data, S3 can do so for about $20,000 (2022). Even FLOP for FLOP, supercomputers already underprice humans [4]. There's less and less for us to brag about.

$\frac{\$/(\text{Human FLOP/s})}{\$/(\text{Supercomputer FLOP/s})}$

Pasted image 20220906143916.png
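The dollar comparison is a two-line division. A minimal Python sketch with the point values from footnote 4 ($7.5 million per statistical life, $1 billion for the supercomputer) and the FLOP/s figures used earlier; the variable names are illustrative only.

```python
# Point values from footnote 4 and the earlier sections.
HUMAN_LIFE_USD = 7.5e6           # FEMA's value of a statistical life
HUMAN_BRAIN_FLOP_PER_S = 1e15    # Carlsmith's central estimate
SUPERCOMPUTER_USD = 1e9          # price tag quoted for Fugaku
SUPERCOMPUTER_FLOP_PER_S = 450e15

human_usd_per_flops = HUMAN_LIFE_USD / HUMAN_BRAIN_FLOP_PER_S
supercomputer_usd_per_flops = SUPERCOMPUTER_USD / SUPERCOMPUTER_FLOP_PER_S

ratio = human_usd_per_flops / supercomputer_usd_per_flops

print(f"human:         ${human_usd_per_flops:.2e} per FLOP/s")
print(f"supercomputer: ${supercomputer_usd_per_flops:.2e} per FLOP/s")
print(f"human / supercomputer: {ratio:.2f}")
```

At these values the human comes out between three and four times as expensive per FLOP/s, which is where the "about a fourth the cost" in footnote 4 comes from.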

Brains are not magic. They're messy wetware, and hardware ~~will catch up~~ has caught up.

Postscript: brains actually might be magic. Carlsmith assigns less than 10% (but non-zero) probability that the brain computes more than $10^{21}$ FLOP/s. In this case, brains would currently still be vastly more efficient, and we'd have to update in favor of additional theoretical breakthroughs before AGI.

If we include the uncertainty in brain FLOP/s, the graph looks more like this:

$\frac{\text{Supercomputer J}/\text{FLOP}}{\text{Human J}/\text{FLOP}}$

Pasted image 20220906150914.png

(With a mean of ~$10^{19}$ and a median of 830.)

Appendix

Squiggle snippets used to generate the graphs above (used in conjunction with obsidian-squiggle).

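brainEnergyPerFlop = {
	humanBrainFlops = 15; // log10 of FLOP/s. Full range: 10 to 23; median 15; P(>21) < 10%
	humanBrainFracEnergy = 0.2; // Brain's share of the body's energy budget (defined but not applied below)
	humanEnergyPerDay = 8000 to 10000; // Daily kJ consumption
	humanBrainPower = humanEnergyPerDay / (60 * 60 * 24); // kW, whole-body intake (kJ per day / seconds per day)
	humanBrainPower * 1000 / (10 ^ humanBrainFlops) // J / FLOP
}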

supercomputerEnergyPerFlop = {
    // https://www.top500.org/system/179807/ 
	power = 25e6 to 30e6; // W (J/s)
	flops = 450e15 to 550e15; // FLOP/s
	power / flops // J / FLOP
}

supercomputerEnergyPerFlop / brainEnergyPerFlop

humanFoodEfficiency = {
	photosynthesisEfficiency = 0.001 to 0.03
	trophicEfficiency = 0.1 to 0.15
	photosynthesisEfficiency * trophicEfficiency 
}

computerEfficiency = {
    solarEfficiency = 0.15 to 0.20
    transmissionEfficiency = 1 - (0.08 to .15)
    solarEfficiency * transmissionEfficiency
}

computerEfficiency / humanFoodEfficiency

evolution = {
    // Based on Ajeya Cotra's "Forecasting TAI with biological anchors"
    // All calculations are in log space.
	
	secInYear = log10(365 * 24 * 60 * 60);
	
	// We assume that the average ancestor pop. FLOP per year is ~constant.
	// cf. Humans at 10 to 20 FLOP/s & 7 to 10 population
	ancestorsAveragePop = uniform(19, 23); // Tomasik estimates ~1e21 nematodes
    ancestorsAverageBrainFlops = 2 to 6; // ~ C. elegans
	ancestorsFlopPerYear = ancestorsAveragePop + ancestorsAverageBrainFlops + secInYear;

	years = log10(850e6) // 1 billion years ago to 150 million years ago
	ancestorsFlopPerYear + years
}

humanLife$ = 1e6 to 10e6
humanBrainFlops = 1e15
humanBrain$PerFlops = humanLife$ / humanBrainFlops 

supercomputer$ = 1e9
supercomputerFlops = 450e15
supercomputer$PerFlops = supercomputer$ / supercomputerFlops

supercomputer$PerFlops / humanBrain$PerFlops

Footnotes

  1. Watch out for FLOP/s (floating point operations per second) vs. FLOPs (floating point operations). I'm sorry for the source of confusion, but FLOPs usually reads better than FLOP.

  2. Photosynthesis has an efficiency around 1%, and jumping up a trophic level means another order of magnitude drop. The most efficient solar panels have above 20% efficiency, and electricity transmission loss is around 10%.

  3. Technically, it's FLOP per "subjective second" — i.e., a second of equivalent natural thought. This can be faster or slower than "truth thought."

  4. Compare FEMA's value of a statistical life at $7.5 million to the $1 billion price tag of the Fugaku supercomputer, and we come out to the supercomputer being a fourth the cost per FLOP.

Rationalia starter pack

LessWrong has gotten big over the years: 31,260 posts, 299 sequences, and more than 120,000 users [1]. It has budded offshoots like the alignment and EA forums and earned itself recognition as a "cult". Wonderful!

There is a dark side to this success: as the canon grows, it becomes harder to absorb newcomers (like myself) [2]. I imagine this was the motivation for the recently launched "highlights from the sequences".

To make it easier on newcomers (veterans, you're also welcome to join in), I've created an Obsidian starter-kit for taking notes on the LessWrong core curriculum (the Sequences, Codex, HPMOR, best of, concepts, various jargon, and other odds and ends).

There's built-in support to export notes & definitions to Anki, goodies for tracking your progress through the notes, useful metadata/linking, and pretty visualizations of rationality space…

vault-graph.png

It's not perfect (I'll be doing a lot of fine-tuning as I work my way through all the content), but there should be enough in place that you can find some value. I'd love to hear your feedback, and if you're interested in contributing, please reach out! I'll also soon be adding support for the AF and the EAF.

More generally, I'd love to hear your suggestions for new aspiring rationalists. For example, there was a round of users proposing alternative reading orders about a decade ago (by Academian, jimrandomh, and XiXiDu), which may be worth revisiting in 2022.

Footnotes

  1. From what I can tell using the graphql endpoint.

  2. Already a decade ago, jimrandomh was worrying about LW's intimidation factor — we're now about an order of magnitude ahead.

My Personality

Most personality tests are bullshit. Even the Big-5 are a bit overhyped. Take it from the experts:

"Personality scales tend to show longterm retest correlations from .30 to .80 over intervals of up to 30 years." [1]

".30 to .80" sounds good until you remember that even the upper limit means the first test score explains only about 64% of the variance in later test scores. At the median retest correlation of .57, almost 70% of your personality is explained by something other than your continuity of existence. Granted, these numbers are great by the standards of psychology, but they're rather dismal for any substantive field.
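The variance figures follow from squaring the correlation. A quick Python check, assuming the usual reading that a retest correlation of r means the first score accounts for r² of the variance in the later one (the function name is illustrative):

```python
def variance_explained(r: float) -> float:
    """Share of variance in the later score accounted for by the earlier one."""
    return r ** 2


for r in (0.30, 0.57, 0.80):
    explained = variance_explained(r)
    print(f"r = {r:.2f}: {explained:.0%} explained, {1 - explained:.0%} unexplained")
```

At r = .57 roughly a third of the variance is explained, which leaves the "almost 70%" unexplained; at r = .30 the explained share drops below a tenth.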

As for the rest: Myers-Briggs, Enneagram, and RIASEC... it's total nonsense. It's still fun — maybe even useful as a vague suggestion of behavioral flavor frozen in time — but ultimately such hogwash that it raises the question, "why take the time to publish this?"

The Big 5+1

aka: "OCEAN", "HEXACO"

Myers-Briggs

ENTJ-A (Commander)

Enneagram

Holland Types

aka: "RIASEC"

2022-Q2

I fell off the wagon.

This was supposed to be the year of the quantified self. I set out to track every minute of my time across several dozen goals in thirteen categories.

The effort began strong. For three months I was off of social media, exercising super consistently, timing mostly everything, and on track towards my goals.

Then... something happened.

Between April and July, the tracking vanished. I returned to my vices, and if anything, I became more distractible than I've been in ages. Not a second of tracking, and all hopes of achieving my goals in the trash.

What happened?

My leading theory is that it was moving-related. In February, I moved to Brasília for two months. In April, I moved back to the US, and by June, I was back in the Netherlands.

Brasília was in many ways a delight: great weather, amazing fruit, a sauna and gym two minutes from my door, and everything extremely affordable. Other things were less than great: the internet speeds (at least at first), my workstation (I appropriated a low-res TV for a monitor on a minuscule kitchen table), no AC.

These things seem minor, but they add up over time. If it takes 60 seconds to install a new package, you open up Hacker News and end up wasting five minutes. Each five-minute session chips away at your attentional capacity. The frustration builds and burns you out. DX matters.

Still, mostly I was on track.

What probably caused the discipline to falter was the disruption of coming back. Moves are great opportunities to change behaviors, but this works in either direction, and the asymmetry of habit formation means you have to be extra careful.

When you're moving a bunch in a short period of time, you have to be even more careful because ego depletion [1] comes into play. You exhaust your willpower and become more susceptible to developing bad habits with each successive move.

Seasoned digital nomads probably have their tricks to get around this, but that's not me yet. Lesson learned.

A few other problems at play:

  • My Obsidian has again become a disordered wreck. I keep on trying to impose fragile top-down hierarchies on the notes, and it ends up breaking everything.
  • In a related vein, I've come to the conclusion that using Obsidian for both task management and knowledge management is bad practice. Tasks should vanish when done. If they linger around they'll muck up your access to the more important persistent knowledge. I've moved to trying out Linear instead.
  • Tracking was much too manual. I was tracking in Obsidian, which proved too unstructured (similar concern to "not using Obsidian for task management"), so I moved on to Google Sheets, which is a nightmare (as you know). This time, I'm going to give Airtable a shot (which takes inspiration & validation from the professionals).
  • At the start of the year, I redesigned my website, because using Next.js for a static site was overkill, but then I went too far in the opposite direction (towards raw, uncut HTML). The problem with this is that regularly publishing is the best way for me to orient my review and tracking processes. When that output process becomes too unergonomic, it clogs up the rest of the pipeline. I'm now using Astro with a custom, simplified export pipeline (a successor to my previous solution & a set of plugins to recreate Obsidian-flavored markdown in the Unified.js ecosystem). This isn't public yet, but it will be when I've ironed out the kinks.

Whatever the reason, it's in the past, and every day is a chance to start fresh.

Let's try again

We've still got basically half a year left. What can we recover?

  1. 🛑 No more scrolling (YouTube, Reddit, Porn, etc.):
    • Right. That failed miserably — I even ended up caving and finally getting on Twitter. 🤷‍♂️ I still agree with the intent but can't deny the value of being up to date with Hacker News, tech Twitter, and edu-Youtube.
    • My idea of a solution was using Inoreader. There, the problem was that it was filling much too quickly and that much of it was low quality. This time around I need to be more diligent in removing feeds that don't serve me.
    • I'm going to try again. This time, I'll allow myself a dash of Hacker News a day — call it part of the job requirements. Youtube, I'll get through Inoreader, and the rest, hopefully never. (I may relax this further, and allow myself some maximum amount of time per day on these trash platforms.)
  2. 🚪 Screen time:
    • I'm scrapping the limit for desktop (because programming is my job).
    • The main obstacle to actually tracking this was that I was manually copying the information every week. This is a perfect opportunity for automation (there is fortunately an API, but you have to call it on device). Until I get access to the data, I'm not going to require myself to track this, but the goal stands: Less than an hour a day as a baseline; less than two as a stretch.
  3. Self-monitoring:
    • Toggl was easy and intuitive; my main obstacle was that I had defined too many different projects and types of tasks.
    • Time to simplify: Only three projects (work, personal, misc). Only a handful of allowed labels: programming, reading, watching, wasting time, organizing, meeting, writing.
    • Also, no more manual copying stuff over. I'm a programmer and should know better. Same for Apple Health information about exercise.
  4. 📚 Books (1 book per week):
    • I'm a bit behind schedule. 19 books in fact. But that's ok, there's plenty of time to catch up. That said, I am scrapping all of the specific goals like read X books in French, Y by this author. I'll just read what I want to read.
  5. 🗃 PKM:
    • I'm going to remove all specific goals and just commit to regular maintenance.
  6. ✍️ Writing:
    • I haven't been writing, but I have plenty of room to catch up with my goal of 6 articles.
    • I missed M4-M7 & Q1, but whatever. For the rest of the year, I'm forbidding myself from including any quantitative result in my reviews that I haven't automated.
  7. 🗣 Languages:
    • I'm scrapping this goal. It was too ambitious from the start. I do want to catch up again, but I have one or two tools I want to finish up before I actually start learning Chinese.
    • My main goal in this category is to just catch up on Anki again & to have no overdue cards in any of my principal decks (General, French, Portuguese, Dutch). Stretch goal if I can work in Italian and German.
  8. 🏃 Moving
    • Subjectively, I'm happy enough with my movement. I'm going to avoid setting quantitative goals until I've automated the information capture.
  9. 🍽 Fasting
    • When I fell off the productivity wagon, I also fell off the IF wagon for the first time in 5 years (but I'm back again).
    • This has fallen by the wayside but it's totally recoverable. I'm going to start committing to one day (Monday) a week for the rest of the year.
  10. 🌏 Diet:
    • I was tracking meat & alcohol consumption. In hindsight, it required a bit too much input. I'm going to drop this until next year.
  11. 👓 Myopia:
    • The initial progress I've made seems to have been undone by staring at the computer screen for ungodly amounts of time. We'll fix this at some future point.
  12. 👥 Relationships:
    • Mentorship & community: I'd actually say that I've achieved these goals though not in the way originally envisioned. I've found my mentors in the right software development streamers & my community in the right discords. 2022, eh? I'm crossing this off as completed.
  13. 💰 Money:
    • We moved back to the cheap Netherlands, and I got a side-job for about one day a week, and we're golden. It's a lot easier if you decide you don't have to live in the US.

Footnotes

  1. I've read this has been somewhat debunked, so take it with the proper grain of salt.

2022-M2

Key:

  • When appropriate, a goal will have (baseline|stretch goal) next to it.
  • ✅: baseline
  • ✅ ✅: stretch
  • ❌: neither

Goals:

  1. 🛑 No more scrolling (YouTube, Reddit, Porn, etc.): ✅
    • I've had one or two lapses this month — not quite scrolling, but more like following a video vortex. Curating my own feed via Inoreader isn't quite the saving grace I imagined it to be. It still ends up quickly saturating with way too much info and there's plenty of low quality content that slips through the cracks.
  2. 🚪 Screen time (12|9h per day): 10h5m ✅ (=)
    • Phone (2|1h): 82m ✅ (+28m)
    • Computer (12|10h): 9h48m ✅ (-8m)
    • I've increased my allowed computer time goal by two hours because I have to accept that working on the computer is my job.
  3. Self-monitoring: ✅
    • I've started tracking my workday in much more detail and may start releasing more precise time for each subarea.
  4. 📚 Books (1 book per week): 4b ✅
  5. 🗃 PKM: Nothing to say here for this month, except that I'm working on a set of "starting vaults" for LessWrong and 80,000 hours. Stay tuned.
  6. ✍️ Writing: ✅
    • Articles (1 article every 2 months | every month): ✅ I've been writing, but I haven't been publishing, since I'm waiting for further improvements to my site.
    • Reflections (1 reflection per month | and a newsletter): ✅ You're reading it.
  7. 🗣 Languages (1000|2000 cards per month): 644 ❌ (-356) (506 cards behind).
    • While in Brazil, I'm actually spending less time making new cards (and more time talking with people). But I could be doing more. This is a chance to rededicate my efforts. Specifically to aim for 50 new cards per day. By the end of the month, I'll have about caught up.
  8. 🏃 Moving
    • Rings (85|95% of the time): 86% ✅
    • Steps (5k|10k): 7.9k ✅
    • Skills: ❌ I told myself I would start doing this this month but I didn't. So my goal to get on track is to start tracking my handstand time at least once a week.
  9. 🍽 Fasting (1x36h per month): ❌ Skipped fasting this week (also haven't been intermittent fasting while in Brazil), so I'm going to squeeze in two fasts this coming month (and make one of them a three-day fast).
  10. 🌏 Diet: ✅
    • Meat (8x|4x 🍗; 1x per month | per two months 🥩🥓...): 8x🍗 ✅
    • Alcohol (8x|4x 🍷): 1x ✅✅
  11. 👓 Myopia: (-.25|-0.25 diopters) ❌
    • I should have been more skeptical last month. Much of the improvements I noted then have subsided. This could be evidence of biased measurements in the past and sampling error, or maybe it's that I've spent way too much time this month behind the computer. Either way, I need to double down and make sure I'm working at blur distance.
  12. 👥 Relationships:
    • Mentorship. ✅ (80,000 Hours career-coaching)
      • I've now finished The Precipice and am almost done with 80,000 Hours' Career-Planning Process, so this month, I'm aiming to arrange my first coaching call.
      • I met with James Norris of Upgradable, which seemed like a promising fit, but probably won't work out. James is the kind of person whose opinion I'm likely to value too much. I already carry the weight of the world and don't want another set of expectations to carry. This reflects less on James than it does me. But it does help me clarify what I'm looking for in a mentor.
    • Community. ✅ (Open Principles Fellowship)
      • The Open Principles introduced me to several people I'll be staying in contact with, but it's already done, so my quest continues.
  13. 💰 Money:
    • No change.