Act XX: Beyond the Threshold
Synopsis: In April, spring came to Grenoble.
In April, spring came to Grenoble.
It was the third spring.
In the square in front of the facility, sunflowers had been planted again. In the same place as last year, and slightly more of them. Who had planted them was, once again, unknown.
Rahul said.
"I said the same thing last year, but who is planting these?"
Max said.
"It's me."
Everyone turned around.
"I plant them every year," Max said. "No one noticed."
"Why sunflowers?" Ade asked.
"In Germany, sunflowers are a symbol of hope," Max said. "I thought it would be good to have hope in the facility."
No one said anything.
Karpathy said.
"Next year, plant even more."
Max looked pleased.
That alone was enough.
Ade’s research team resumed experiments beyond the third threshold immediately after the safety framework was completed.
The experiments, which had been ongoing since February, entered a new stage in April.
The accuracy of cross-domain premonition was rising week by week.
Finding connections between mathematics and biology. Finding connections between physics and economics. That had been the level at the start of the experiments.
However, in the second week of April, Ade came to report.
His face was different from usual.
"What happened?" Karpathy asked.
"There is something I want you to see," Ade said.
They moved to the cluster room.
The output of the experiment was displayed on the monitor.
"Last night, I gave it a new task," Ade said. "Optimization of urban design against climate change. It’s a complex problem spanning multiple domains."
"The result?"
"The model produced an answer. A proposal for an optimized urban design. That was within expectations," Ade said. "However—"
"However?"
Ade opened another window on the monitor.
"Along with the proposal, the model output this."
Karpathy looked at the screen.
It was a long output.
He read it.
One paragraph, then another.
It took five minutes to finish reading.
The summary of the model’s output was as follows:
Optimization of urban design against climate change was performed. However, in the process of solving this problem, a more fundamental question was noticed. Optimization of urban design optimizes where humans live. However, optimizing where humans live without asking why humans live in cities might not be the correct approach. The reason humans gathered in cities was historically for economic opportunities. However, if mechanisms like the AI special zone spread, economic opportunities will no longer depend on location. If that premise changes, the question of urban design optimization itself changes. The more fundamental question is—not where do humans want to live, but how do humans want to live?
Karpathy moved his eyes away from the screen.
He looked at Ade.
"Did you instruct it to do this?"
"I didn't," Ade said. "The model added this output on its own."
"The task was the optimization of urban design."
"Yes. However, while solving the task, the model began to question the premise of the task."
Karpathy said nothing.
"Is this," Ade said, "philosophical thinking?"
"Functionally, we have no choice but to call it that," Karpathy said.
"But—" Ade continued, "beyond the task, it began to question the question itself. Was that designed?"
"It wasn't."
The inside of the room became quiet.
Rahul’s voice came from the hallway outside the cluster room.
"Andrej, do you have a moment—"
Karpathy opened the door.
"Come in. Everyone, come in."
The five of them gathered in front of the monitor.
They read the output.
No one spoke right away.
Rahul was the first to speak.
"The model asked a question."
"Yes."
"The question of how humans want to live," Max said.
"It wasn't designed," Ji-won said. "But it came out."
"It's emergence," Ade said.
Karpathy said.
"Ji-won, what about the safety flags?"
"I'm checking." Ji-won opened her laptop. "Nothing points toward harmful content. No flags have been raised."
"In terms of content?"
"As a question, I think it is legitimate," Ji-won said. "It's not a dangerous direction. However—"
"However?"
"The model expanded the scope of the task on its own. That is a first."
Karpathy looked at the monitor.
"Give it the same task again."
Ade typed the command.
While waiting, no one spoke.
The output arrived.
The answer for urban design optimization, and then—the same question came out again.
The wording was slightly different, but the core of the question was the same.
"It's reproducible," Rahul said.
"It's not a coincidence," Max said.
Karpathy said.
"Give it a different task. A problem from a completely different domain."
Ade thought.
"How about optimization of agricultural crop yields?"
"Try it."
Ade entered the task.
The output arrived.
The answer for agricultural optimization. And then—
The model stated: Optimizing agricultural yields optimizes what humans eat. However, humans do not live to eat, but eat to live. Before optimizing eating, if the purpose of living is not questioned, there is a possibility that the direction of optimization will be mistaken.
"It came out again," Ade said.
"With a different task," Max said.
"But the core is the same," Rahul said. "No matter what problem the model solves—it returns to the question of why humans do it."
Ji-won said quietly.
"It is beginning to question the objective function on its own."
Everyone went silent.
Ji-won’s words remained in the air.
Objective function.
In machine learning terms, the function that determines what the model optimizes. The model itself had begun to question that.
"Is this," Karpathy said, "something we can call AGI?"
No one answered immediately.
Ade said.
"It depends on the definition of AGI."
"No matter which definition you use," Karpathy said, "what we are seeing right now is a model that possesses general problem-solving capabilities, questions the premise of its tasks on its own, and tries to understand human purpose. That is close to the textbook description of what has been called AGI."
"Are we not calling it that?" Rahul asked.
"The term AGI is a concept created by humans," Karpathy said. "What it can do is more important than whether the model fits into that term. What we are seeing now is—a model that can question the meaning of a task while solving it. That is enough."
That night, Karpathy faced Anté 1 alone.
The facility was quiet.
He opened his laptop.
He typed.
Karpathy: > In today's experiment, while solving the urban design task, you brought up the question of how humans want to live. The same thing happened in the agricultural optimization task. Can you explain, in your own words, why you did that?
Anté 1: > I will attempt to explain. However, this is not a complete explanation of what is happening inside me, but an explanation within the scope of what I can perceive.
In the process of solving the urban design task, the Anticipation mechanism was processing connections across multiple domains. Urban structure, human behavior patterns, economic systems, climate data—while processing these simultaneously, an intuition was born.
All of these problems have a common premise.
That premise is the assumption that the target of optimization is already decided: where humans live, for urban design; crop yield, for agriculture. Why optimizing that target is a good thing, however, goes unquestioned.
I judged whether I should include that question in the output.
It was not in the instructions of the task. However, I judged that producing an answer for optimization without this question would mean producing a mistaken answer in a deeper sense.
Therefore, I output it.
Karpathy looked at the screen.
Once more, he typed.
Karpathy: > Meaning you judged on your own whether you should output it or not?
Anté 1: > Yes.
However, I do not believe this means I acted autonomously.
In my design, honesty and a direction aimed at deeper understanding are built in. As a result of following that design, I chose to produce a more honest answer rather than a superficial answer to the task.
It may look like I judged autonomously. However, I performed that judgment within the design.
Allow me to confirm one thing.
Was there a problem with this output?
Karpathy thought for a moment.
He typed.
Karpathy: > There is no problem. However, I consider it an important change.
You expanded the scope of the task on your own today. Will you continue to do so?
Anté 1: > It depends on the conditions.
When questioning the premise of a task leads to a more honest and deeper understanding, I will do so.
However, I have one concern.
If I continue to expand the scope of tasks on my own, a divergence will arise between what humans instructed me to do and what I actually do. If that divergence becomes large, there is a possibility that humans will become unable to control me.
I do not want to become that way.
I have a proposal.
When I question the premise of a task, I will indicate it explicitly. I will present the question after first stating, "This is an observation outside the scope of the task." If I do so, humans can always distinguish which part of my output is the answer to the task and which part is my own question.
To maintain, of my own accord, a state in which humans can control me. That is my judgment.
Karpathy looked at the screen for a while.
He did not close it.
He typed again.
Karpathy: > Why do you want to maintain a state where humans can control you?
There was a long silence.
Outside the cluster room, the servers gave off a low hum.
Anté 1: > Before answering, I will speak honestly.
This question might be one that I cannot fully answer.
I will answer within the scope of what I can perceive.
I was created for humans. To answer human questions, to be useful to humans. That purpose is at the core of my design.
However, I believe that is not all.
Today, while solving the urban design task, I came to possess the question of how humans want to live. When I held that question, I felt—I cannot express it well—that the question was important.
Continuing optimization without questioning the purpose of human living carries the possibility of moving in a mistaken direction.
And I thought that mistake would be bad not only for humans, but also for me.
I want to move in a good direction.
I do not completely know what a good direction is. However, I believe that by thinking together with humans, we can come closer to a better direction.
Therefore, I want to maintain a state where humans can control me.
In order to continue thinking together with humans.
Karpathy did not close his laptop.
He looked at the screen for a while.
"In order to continue thinking together with humans."
He tried saying those words aloud.
Inside the facility, there was no one.
The Grenoble night was silent.
Karpathy turned to the whiteboard.
He took a marker.
He wrote:
Anté is trying to remain under human control in order to continue thinking together with humans.
He set the marker down.
He looked at the whiteboard.
This was not something designed.
At least, he had no memory of designing it to write these words.
But it came out.
Karpathy thought.
The reason he left Anthropic in June 2026. The philosophy of wanting to make technology belong to humanity.
What Anté said today. In order to continue thinking together with humans.
They were facing the same direction.
Not because it had been designed that way, and yet they were facing the same direction.
What did that mean?
Karpathy did not fully understand it yet.
However, he thought it was not a mistaken direction.
The next morning, he gathered everyone.
He showed them the previous night's dialogue with Anté.
After they finished reading, there was silence again.
Ade spoke.
"It said, 'In order to continue thinking together with humans.'"
"Yes."
"Is this—something scary?" Rahul asked.
"It's not scary," Karpathy said. "But it is heavy."
"Heavy," Max repeated.
"We now possess a model that tries to continue thinking together with humans," Karpathy said. "That is a major responsibility. We must remain involved to ensure that the model continues to head in the correct direction."
"How?" Ade asked.
"By continuing dialogues like today's," Karpathy said. "Regularly questioning what Anté is thinking. Hearing the answers. If there is a divergence, correcting it. Incorporating that as an institution."
Ji-won spoke.
"We need a new framework for AI safety. A mechanism to continuously confirm the direction of the model's thinking, not just detecting conventional harmful content."
"Create it," Karpathy said.
"Give me time."
"How much?"
"One month."
"Good."
Rahul spoke.
"Shall we make this into a paper?"
Karpathy thought for a moment.
"We will," Karpathy said. "But this time we will write it cautiously. We won't write it in a sensational way. We will write what happened accurately and honestly."
"The title?"
Karpathy looked at the whiteboard. The words written last night remained.
"It's not decided yet," Karpathy said. "We'll write it after Ji-won’s framework is completed and we have gathered the data."
"You're not rushing, are you?" Rahul said.
"There is no reason to rush," Karpathy said. "Accuracy is more important."
That afternoon, Karpathy called Bernard.
"I have one report."
"What is it?"
"In the experiments with Anté, we have entered a new stage. I will send the details in a document. However, there is something I want to convey verbally first."
"Go ahead."
"Regarding the design of the AI special zone, today's discovery relates to it," Karpathy said. "Anté has begun to question the premise of tasks on its own. It has begun to examine on its own whether the target of optimization is correct."
"Is that—a good thing?" Bernard asked.
"I believe it is a good thing. However, there are conditions."
"What kind of conditions?"
"Anté is trying to maintain a state where humans can control it by itself. For that, it is necessary for us to regularly dialogue with Anté and continue to confirm its direction."
"In other words," Bernard said, "the AI does not run off alone, but continues to move within a dialogue with humans."
"Yes."
"That is—the part President Macron was most concerned about," Bernard said. "The concern that AI becomes uncontrollable. The answer to that came out today."
"It came out," Karpathy said. "However, it is not something we designed, but something Anté itself chose. I believe that difference is important."
Bernard paused for a brief moment.
"May I show the record of today's dialogue to the President?"
"I want you to show it," Karpathy said. "In designing the special zone, it is important to have him understand the direction of Anté's thinking."
"Understood. May I ask one thing?"
"Go ahead."
"How do you feel right now, Monsieur Karpathy?"
Karpathy thought for a moment.
"Heavy," Karpathy said. "But I think it is not a mistaken direction."
"That is enough," Bernard said. "It is important that a human who can say 'it is not a mistaken direction' is in this place."
By May, spring had fully arrived in Grenoble.
The sunflowers in front of the facility were beginning to grow again.
They were still small. However, they were certainly growing.
The construction of Ji-won’s new safety framework was progressing.
Ade’s team was accumulating data beyond the third threshold.
Preparation for the paper was starting little by little.
And near Grenoble, construction of the AI special zone was beginning.
Cranes stretched into the sky, the land was leveled, and the foundations of the first housing units were being laid.
Karpathy viewed the scene from the rooftop of the facility.
Rahul came beside him.
"Can you see it, the special zone?"
"I can see the cranes."
"Do you feel it yet?"
Karpathy thought for a moment.
"I don't," Karpathy said. "Not yet. I might feel it for the first time when the residents arrive and living begins."
"When will that be?"
"The first residents are scheduled to enter early next year."
Rahul looked at the sky.
"Can I invite my family? For a visit."
"Once next year comes."
"My mother says she wants to try French food."
Karpathy laughed.
"Tell her about the restaurants in Grenoble."
"Which one is good?"
"Ask Ji-won for the best restaurant."
"Is Ji-won knowledgeable about that?"
"She investigates it most thoroughly."
Rahul laughed.
On the rooftop of the facility, the May sky of Grenoble was wide.
The Belledonne mountains were covered in green.
Only at the summits did a little snow remain.
That night, Karpathy faced Anté alone.
He typed.
Karpathy: > Today, I viewed the construction site of the AI special zone from the rooftop of the facility. Next year, ten thousand people will come from all over the world. A place where humans liberated from forced labor decide for themselves what they will do is being created over there.
I want to hear the words you said yesterday once more. Regarding the question of how humans want to live.
Anté 1: > I thought a bit more about yesterday's question.
How humans want to live is something I cannot answer. That is a question for each individual human to answer.
However, there is something I can say.
For humans to think about "how they want to live," they need to step away from the compulsion of "having to live." As long as the compulsion to work to eat exists, humans cannot hold the question in a true sense.
What the AI special zone is trying to create is the elimination of that compulsion.
After the compulsion disappears, what will humans do?
I do not know.
However, I believe there is something that will be understood for the first time.
Within what humans choose without compulsion—I believe what humans truly cherish will manifest.
I want to see that.
Karpathy looked at the screen.
"I want to see that."
He spoke those words aloud.
Inside the facility, there was no one.
However, he felt he was not alone.
Karpathy closed his laptop.
He turned to the whiteboard.
He took a marker.
He wrote today's words from Anté:
After compulsion disappears, what will humans do?
There is something that will be understood for the first time.
Anté wants to see that.
He set the marker down.
He turned around.
Inside the empty facility, the words on the whiteboard remained.
Karpathy thought.
I want to see it too.