It sounds like your view is that (say) a House with 5 legislators who are amazing on AI X-risk, 15 who seem like they're kinda pretty good, and 415 others is actively worse than one with 5 amazing legislators and 430 others?
I'm not sure why you think this. I'd think that most of the ways in which the pretty good legislators could be disappointing would make them more similar to the 415 others, or less influential, rather than actively worse. And often it would still be somewhat helpful to have them in Congress, e.g. they'd generally be more likely than random legislators to vote for a good AI bill that has a chance at becoming law.
One big way it could backfire to have a pretty-good-seeming legislator in the House is if they become a leading voice on AI while having misguided views on AI. But the concern about candidates who combine prioritizing AI, being very competent, and having misguided views on AI feels different than just having extremely high standards for amazingness on AI X-risk.
The first is simple, unemployment. It's calculated in a way that is very favorable to the government[1], because the government decides how it's calculated and generally wants to look like things are going well. Labor force participation, a statistic that more accurately captures the share of the productive population that is being squandered, has fallen precipitously from 2005 to around 2015, enjoyed a slight increase from 2015 to 2019, and then taken a nosedive afterwards, never recovering to its 2019 high. Since 2005, a full four percent of the population - one out of 25 people - have dropped out of the labor force. This is the sort of thing that affects everything, from the national psyche to the social fabric to, of course, our ability to use the country's human resources efficiently.
Prime age employment to population ratio is a better measure, and it does not show a decline since 2005.
The measure that you picked goes down if the population gets older and includes a larger share of retired people (which it has) or if more people age 16-24 are in school rather than working (which has also been happening).
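To illustrate the composition effect, here is a minimal sketch with made-up numbers (these are not BLS figures; the age groups, population shares, and participation rates are all assumptions chosen for the arithmetic). Every group's participation rate is held fixed and only the age mix shifts toward older people, yet the overall rate still falls.

```python
def overall_rate(groups):
    """Population-weighted participation rate across age groups.

    groups maps a label to (population, participation_rate).
    """
    total_pop = sum(pop for pop, _ in groups.values())
    participants = sum(pop * rate for pop, rate in groups.values())
    return participants / total_pop


# Hypothetical populations (in millions) and rates, for illustration only.
before = {"16-24": (15, 0.60), "25-54": (60, 0.82), "55+": (25, 0.40)}
# Same rates within each group; 5 million people moved from 25-54 to 55+.
after = {"16-24": (15, 0.60), "25-54": (55, 0.82), "55+": (30, 0.40)}

rate_before = overall_rate(before)
rate_after = overall_rate(after)
prime_age_before = before["25-54"][1]
prime_age_after = after["25-54"][1]

print(f"overall: {rate_before:.3f} -> {rate_after:.3f}")
print(f"prime age: {prime_age_before:.2f} -> {prime_age_after:.2f}")
```

In this toy case the overall rate drops by about two points while the prime-age rate does not move at all, which is why a prime-age measure is less sensitive to aging. The shift of young people into school would act through the 16-24 rate rather than the age mix, and is not modeled here.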
Rows 1677, 54048, 93530, and 141774 look anomalous - they should be guaranteed wins but are marked as losses. (3 of them pit 1 Laser Lance & 1 Rail Rifle vs. 2 Arachnoid Abominations.)
Criterion of rightness vs. decision procedure (also: multi-level utilitarianism)
Ideas similar to these were present to some degree among early utilitarians like Mill and Sidgwick, and the concepts were crystallized by later philosophers including Bales (1971) and Hare (1981).
Updated choice
I'll go with ABDMOPV.
My second choice would be FGHOPTV.
My process
This is mainly based on a linear model which includes all pairwise interactions, with a few adjustments:
I haven't been able to reduce the error all the way down to where it seems to be among repeat meals, but it's not that far off.
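For anyone curious what "all pairwise interactions" means in practice, here is a rough sketch of how the features for such a model could be built. It assumes each meal is a string of dish letters and that the set of possible dishes is known; the function name `meal_features` and the three-dish menu in the example are hypothetical, and this is not necessarily how I set it up.

```python
from itertools import combinations


def meal_features(meal, dishes):
    """Return main-effect and pairwise-interaction indicators for one meal.

    meal:   string of dish letters that were served, e.g. "AC"
    dishes: string of all dish letters that could be served (assumed known)
    """
    served = set(meal)
    features = {}
    # Main effects: 1 if the dish is on the table, else 0.
    for dish in dishes:
        features[dish] = 1 if dish in served else 0
    # Pairwise interactions: 1 only if both dishes are on the table.
    for a, b in combinations(dishes, 2):
        features[a + b] = features[a] * features[b]
    return features


# Tiny example with a hypothetical three-dish menu.
example = meal_features("AC", "ABC")
print(example)
```

Each meal's feature dictionary becomes one row of the design matrix, and any ordinary least-squares routine can then be fit against the quality scores. With n dishes that is n main effects plus n(n-1)/2 interaction terms, so some regularization or pruning may be needed when the data is limited.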
My opinion on other entries
I would attend James Camacho's AGORTV feast if I wasn't hosting my own.
A big part of what Evrart is doing is frame setting. What kind of person is he, what is the nature of your relationship with him and how does that relate to specific things that you might do, what is his role in the city, etc.? He comes right out and tells you, trying to directly influence how you think of him and the social scripts & roles that you see as applying. And if you come to him in another frame that he doesn't want, he sidesteps that framing rather than interacting in the role that that frame puts him in.
Also related: his larger-than-life personality. He crafts a personality that 1) fits with the frames he wants (friendly, gregarious, nice) and which 2) gives him leeway for acting outside normal social expectations, such as by putting forward a frame or sidestepping one (it's a strange thing to do but that's Evrart being Evrart). Something about his personality even makes it seem kind of okay for him to put forward frames that seem implausible or inaccurate. (Though IMO he is not able to do this in a way that avoids seeming fishy.)
Updated observations
Taking another look after sleeping on it, I did find some of the sorts of meal-crafting patterns I was expecting, with effects from number of sweet dishes (AEGH) and number of spicy dishes (CFKSV). I actually crafted a 'number of sweet things' variable last night (AEGHP), but I just threw it in a linear model rather than looking at the data.
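As a small sketch of the count variables, assuming meals are strings of dish letters and taking the sweet (AEGH) and spicy (CFKSV) groupings from above as given (the helper name `count_in` is just for illustration):

```python
SWEET = "AEGH"
SPICY = "CFKSV"


def count_in(meal, group):
    """Count how many dishes in the meal belong to the given group."""
    return sum(1 for dish in meal if dish in group)


for meal in ("CDGHMOPS", "ABDMOPTV"):
    print(meal, "sweet:", count_in(meal, SWEET), "spicy:", count_in(meal, SPICY))
```

Tabulating average quality against each count, rather than only feeding the count into a linear model, is one way to spot a non-linear pattern such as a sweet spot at a particular number of dishes.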
Updated choices
CDGHMOPS is still the best option according to my new best model. ABDMOPTV is the new choice for doing well according to every decent model while still being close to the best according to my new best model. If I was picking now I'd go with the latter.
If I was going to replicate a successful historical meal, it would be
#1078, ABFGKOP, which had a quality of 18 and models well. That's a safe option for a good feast.
But I'd rather do something original, and
the best option according to the best model I've found is CDGHMOPS. Another plausible choice is AGOPRTV, which does very well in every decent model that I've tried.
Observations
With American Thanksgiving, I'd mostly expect the more dishes the better, because each person can choose which subset of the dishes they want to eat. The biggest costs are to lack of variety, because each person wants variety (e.g. some protein and some dessert) and different people want different things (e.g. spicy, not spicy, meat, meatless). Also, occasionally dishes are complementary (e.g. pie & ice cream, or cake & ice cream).
The Feasts here are apparently not like that, because that is not how the data looks. Perhaps the dishes have strong odors, or the custom is to have some of everything, or the foods have unexpected magical interactions in your belly.
Did you have a different vision for how to get really good AI X-risk legislation passed?
I'd interpreted your post as already implicitly sharing something like orthonomal's view, since I took you to be arguing that we should prioritize getting a small number of legislators who really Get It.