I'm not an AI expert. But I genuinely believe that the biggest man-made problem in the world, and the root cause of AI risk, is the cycle behind every other man-made problem: bad values build bad systems, and bad systems make bad values. I call this the human values problem, and I think it urgently needs to be solved before AGI arrives.
I came here specifically because I believe this community has the combination of rationality, AI knowledge, and intellectual honesty needed to find where my arguments break.
My two main goals for this discussion are:
1. To find out if the way I framed the problem is correct.
2. To hear what you think is the most direct and safest path to solving it.
---
Important Note: This post is heavily shortened and covers only the strongest argument and the proposed solution, because the longer post I submitted previously was rejected. In case any terms in this post are unclear, I've written a dedicated definition of terms so that you can more easily understand what I actually mean. You can read the full detailed writeup here: https://www.provensuccess.ai/blog/human-values-problem-root-cause-of-ai-risk
---
ARGUMENT
Every system you grew up inside, such as your family, your school, your workplace, and the government, was built to reward certain behaviors and punish others. Many of those systems rewarded behavior that reflects bad values. That shaped the values you carry today. And without knowing it, you build and work inside systems that pass those same values to the next generation.
This is the human values problem: a cycle that keeps repeating itself, in which people build systems that reward bad values, and those systems produce bad values in the next generation of people.
For example, in business, profit can come from two entirely different sources: you can create value or you can take it. The current business system rewards both equally.
Creating value in business means doing work that genuinely helps workers, customers, and the community, where profit comes from solving real problems and treating people well. Products deliver genuine value to the people who use them. Workers are paid fairly and treated with dignity. They stay for years because they want to, not because they have no other option. They know the business deeply. They care about the product being built because they are treated as people who matter. Customers get genuine value and come back because the business actually helped them. Everyone involved benefits. The workers. The customers. The community. And the owner.
Taking value in business means doing work that extracts from workers, customers, and the community, where profit comes from paying as little as possible, charging more than the value delivered, and pushing costs onto everyone else. Workers are paid as little as possible regardless of how much value they create. They are treated as replaceable because replacement is cheaper than paying people their actual worth. Turnover is high because workers leave the moment they find anything better. Every corner that increases the margin gets cut even when the cost falls on workers, customers, or the environment. The owner tells themselves, and often genuinely believes, that they are responding to market demands and that anyone who did it differently wouldn't survive.
The problem here is that the case for taking value is often correct in the short term. In many markets, taking genuinely outcompetes creating. A business paying workers fairly faces higher costs than one paying the minimum it can get away with. In a price-sensitive market, the taker can undercut and win. The upfront costs of treating people well show up immediately. The benefits take months or years to appear. The system makes taking look rational while hiding the true cost to everyone over time.
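To make the timing argument concrete, here is a minimal sketch in Python. Every number in it is hypothetical and chosen only for illustration, not drawn from any real business: a "taker" with flat revenue and low wage costs, and a "creator" with higher wage costs whose revenue grows a little each month as customers return. The function names and the linear growth assumption are mine.

```python
def cumulative_profits(months, base_revenue, wage_cost, revenue_growth=0):
    """Return cumulative profit after each month.

    Month m (0-indexed) earns base_revenue + revenue_growth * m
    and pays wage_cost. All units are arbitrary.
    """
    totals = []
    running = 0
    for m in range(months):
        running += base_revenue + revenue_growth * m - wage_cost
        totals.append(running)
    return totals


def first_month_ahead(creator, taker):
    """Return the 1-indexed month in which the creator's cumulative
    profit first exceeds the taker's, or None if it never does."""
    for month, (c, t) in enumerate(zip(creator, taker), start=1):
        if c > t:
            return month
    return None


# Hypothetical inputs: the taker pays less and earns a flat amount;
# the creator pays more up front, and revenue grows by 3 per month.
MONTHS = 24
taker = cumulative_profits(MONTHS, base_revenue=100, wage_cost=40)
creator = cumulative_profits(MONTHS, base_revenue=100, wage_cost=60,
                             revenue_growth=3)

print(first_month_ahead(creator, taker))  # prints 15
```

With these made-up numbers, the taker is ahead on cumulative profit for the first 14 months, and the creator only pulls ahead in month 15. An owner judging by the first year would conclude that taking wins. The sketch leaves out the costs that taking pushes onto workers, customers, and the community, which never appear in the taker's own figures.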
This is just one example of the countless systems today that reward behavior that reflects bad values.
AI is now being built on top of those same values. A perfectly aligned AI that does exactly what people want is still the most powerful tool for causing unnecessary harm ever built, if the people using it exhibit bad values, even unknowingly or unintentionally.
So why not start solving the human values problem now, before AGI arrives within the 5 to 10 year window that frontier AI organizations themselves are projecting?
SOLUTION
The foundation of the solution is one hypothesis: almost all people, if not all, just want to live a happy, successful life they won't regret. If that hypothesis is true, then the solution doesn't require convincing 8 billion people to care about strangers or the broader world. It doesn't require convincing them to adopt good values because someone told them to. It requires showing each of them, in their own specific situation, that good values are the most direct path to the life they already want, whether that life is focused on themselves, their family, or the broader world.
A person who works entirely for themselves and their loved ones, and who doesn't cause unnecessary harm to others in doing so, is already living in a way that reflects good values. The solution for that person isn't to change what they care about. It's to show them the clearest, most direct path to getting what they want without causing unnecessary harm along the way. This solution may be the most direct entry point into every person's life, regardless of where they are in the world.
The proposed solution, built on that hypothesis, has three connected parts.
The first is to change business systems to reward behavior that reflects good values and punish behavior that reflects bad values. The starting point is businesses. The approach is to show each business owner the full picture of what their choices actually cost and produce over time, so that the most profitable choice and the most ethical choice become the same choice.
The second is to build a documented, verifiable record of proof of success showing that good values produce better outcomes in real situations, reaching at least 80% of the world's population with their own proof of success before AGI arrives. That record can become part of the data AGI learns from, giving it an accurate picture of how humans actually behave when their systems reward behavior that reflects good values.
The third is to use an AI system to do both of the above: help business owners see the full picture and scale with good values, and collect that verifiable proof at scale, then expand to help all people in the world achieve their goals without causing unnecessary harm to others. My co-founder, James, has built the first version of that system. But financial constraints have prevented us from reaching our first users, and the project is currently paused.
QUESTION
What do you think is the most direct and safest path to solving the human values problem in a way that benefits everyone in the world, before AGI arrives?
I have read several posts on LW over the past few days, but I haven't seen one that talks deeply about solving the root cause of AI risk.