Getting the Easy Alignment Problems Wrong
Against Sesame Maximizers
I’m not sure if I really believe in the AI doomerism stuff, but one thing I find discouraging is how often we screw up problems that are analogous to AI alignment (predicting how a complex system will behave given a set of rules) but much easier.
Seriously, I’m not kidding. The Food Allergy Safety, Treatment, Education, and Research (FASTER) Act, which was passed with bipartisan support, basically said:
If you’re selling a food that contains sesame, it has to be labeled.
If you’re selling a food that doesn’t contain sesame, you need to follow a bunch of burdensome rules, including careful cleaning of manufacturing equipment, to be absolutely 100% sure there is no contamination.
The result is that now a lot of companies are choosing to add small amounts of sesame to products that had previously been sesame-free, and labeling it. According to the article, this is causing some pretty serious hardship for people who are allergic to sesame:
The article also quotes some consumer protection advocates and politicians who had lobbied for the passage of the law. None of them apologized or took responsibility for the situation.
This is a general problem in public policy. Sometimes political conflicts happen because different groups have competing interests (the kind of thing public choice theory describes) and advocate for policies accordingly. But other times there are these alignment problems where we simply fail to correctly predict what the results of a policy will be, so the results don’t match our desired outcome even when we get the policy we’re advocating for.
Anyway, this is just something to think about. As far as I understand, AI alignment is a technical problem about predicting the behavior of a complex system given a set of rules that the system is following (the objective function / loss function). I think these examples from public policy make good metaphors, and help explain the problem of unpredictability of complex systems even when you get to set the rules for them.
They also might be more believable as examples because they’re actually real — the famous Paperclip Maximizer sounds like science fiction to a lot of people, but a bunch of US lawmakers really did just try to program a sesame-minimizer and got a sesame-maximizer instead.
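If it helps to see the sesame story as an optimization problem, here is a toy sketch. Everything in it is an assumption for illustration: the `Option` class, the `cheapest_option` function, and the cost numbers are all made up, and they are not taken from the law or from any real manufacturer. The only point is that the manufacturer minimizes its own compliance cost, not the amount of sesame in the food supply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    contains_sesame: bool
    compliance_cost: float  # made-up cost in arbitrary units


def cheapest_option(options):
    """What the manufacturer actually optimizes: lowest compliance cost."""
    return min(options, key=lambda option: option.compliance_cost)


# Hypothetical numbers, chosen only to illustrate the incentive.
OPTIONS = [
    Option("stay sesame-free and deep-clean every line", False, 100.0),
    Option("add a little sesame and label it", True, 5.0),
]

choice = cheapest_option(OPTIONS)
print(choice.name, "| contains sesame:", choice.contains_sesame)
```

Nothing in the rules says "add sesame." The rules just make adding sesame the cheapest way to comply, and the system being optimized goes wherever the costs point.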
Like I said, I don’t know if I really believe in the AI doomerism stuff or not. Eliezer Yudkowsky was interviewed on the Lunar Society Podcast (great podcast btw) recently and made a compelling case that if an AGI were developed with our current lack of knowledge about alignment, and it had the ability to physically interact with the world, it would be very likely to destroy the human race.
I don’t really have counterarguments to what he said, except that I’m not sure if the current trajectory we’re on will lead to an AGI with agency and the ability to physically interact with the world. At least, it doesn’t seem inevitable to me that this will happen if we keep making more and more advanced ChatGPTs.
But if we do end up in a situation where we have some extremely complex system, much smarter than us, with agency and the ability to interact with the world… then yeah, I’m not very optimistic that we’ll be able to predict and constrain its behavior even if we have the power to set the rules for it.
Sidenote — I was mostly just using the sesame story as a metaphor here, but while I’m talking about it I’ll also mention a related issue. Apparently there have been some attempts by vegans to insist that any vegan options a restaurant provides must be cooked on separate equipment that isn’t used to cook meat.
In 2019, someone sued Burger King (though the lawsuit was dismissed) after finding out that their Impossible Whopper was cooked on a grill that meat had also been cooked on.
PETA has actually chimed in with a bit of sanity, urging vegans not to do this, since making it more difficult, expensive, and legally risky for restaurants to offer vegan options will probably lead to fewer vegan options being offered. Article about it here.
Anyway, I’m a vegetarian myself, and I’d like it if we could get this alignment problem right and not discourage restaurants from offering vegetarian options by making it more difficult.