Eric Schmidt on the Future of Artificial Intelligence (AI)


Eric Schmidt's talk on the future of artificial intelligence (AI), given in Stanford University's ECON295/CS323 course.

A summary of the talk follows:

  • Introduction
    • Eric Schmidt is introduced: he previously served as CEO of Novell and of Google, and now works at Schmidt Futures.
    • Because his time is limited, the talk goes straight into questions.
  • The Short Term of AI
    • Schmidt is struck by the progress of large context windows and what it implies for the next year or two.
    • He describes LLM (Large Language Model) agents and the idea of "text to action."
    • As an example, if the government bans TikTok, people could ask an LLM to build a replacement.
    • He believes the combination of these three technologies will have an enormous impact, even larger than that of social media.
  • Bottlenecks and Challenges in AI Development
    • Schmidt says only about three labs are building frontier models, and the gap between them and everyone else appears to be widening.
    • He notes the enormous capital requirements: the next wave may take tens of billions of dollars or more.
    • He stresses the importance of energy consumption, joking that the US may need to partner with Canada because of its abundant hydropower.
    • He also discusses NVIDIA's CUDA technology and its near-monopoly position.
  • Thoughts on Google and Competition
    • Schmidt criticizes Google for prioritizing work-life balance, which he says cost it the lead in AI.
    • He argues startups often succeed because their people work much harder.
    • He mentions other companies (Microsoft, AMD) trying to catch up with NVIDIA.
  • AI and National Security
    • Schmidt chaired a US AI commission and recommends that the US must stay ahead.
    • He expects an intense battle for knowledge supremacy between the US and China.
    • He notes the US restricts exports of high-end chips to China to preserve its technological edge.
  • AI and Warfare
    • Schmidt discusses the "White Stork" project he is involved in, which aims to drive down the cost of drones used in the war.
    • He believes this can make land invasions far less viable and shifts the emphasis from defense to offense.
    • He admits he is not fond of the military, but hopes to reduce the casualties of war.
  • AI and the Nature of Knowledge
    • Schmidt co-wrote an article on the nature of knowledge with Henry Kissinger and Dan Huttenlocher.
    • He argues today's models have become so complex that humans struggle to understand how they work.
    • He cites physicist Richard Feynman: "What I cannot create, I do not understand."
    • He suggests we may have to accept that we cannot fully understand these models, while still mapping their capabilities and limits.
Commentary:
The speaker is Eric Schmidt, introduced as someone with a notable background. The moderator mentions first meeting him about 25 years ago, when Schmidt was CEO of Novell; Schmidt then joined Google in 2001 and founded Schmidt Futures in 2017. He has been involved in many other projects, but because his time today is limited, the conversation moves straight into questions about artificial intelligence (AI).

### Trends in AI development
The moderator asks about Schmidt's outlook for AI over the next year or two. Schmidt stresses how fast AI is moving, saying that every six months he has to rethink where it is headed. He points to three technologies advancing quickly: large context windows, AI agents, and "text to action." He explains each of them:

1. **Large context windows**: This lets an AI system hold a large amount of data in its short-term memory. Such a system can take in a huge amount of text, say the contents of 20 books, and summarize the key points, much like human short-term memory.

2. **AI agents**: These systems carry out tasks on their own, for example reading chemistry, running experiments, and folding the results back into what they have learned. This lets them keep improving themselves and grow more capable.

3. **Text to action**: Schmidt gives the example of turning a user's natural-language command into concrete actions, even generating an application on the fly. In the future a user could instruct an AI to complete a complex task quickly, such as building a TikTok-like app and promoting it (a minimal sketch of all three ideas follows this list).
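To make the three ideas above concrete, here is a minimal Python sketch. It is purely illustrative: the `llm()` stub, the helper functions, and the canned response are assumptions made for the example, not anything described in the talk.

```python
# A minimal, hypothetical sketch of the three ideas Schmidt highlights:
# (1) a large context window used as short-term memory,
# (2) an agent that acts, observes, and folds results back into its context,
# (3) "text to action": turning a natural-language instruction into executable code.
# The llm() stub stands in for any large-language-model API; it is an assumption.

def llm(prompt: str) -> str:
    """Placeholder for a call to a large-context LLM.

    A real implementation would send `prompt` (possibly millions of tokens)
    to a model and return its text completion.
    """
    return "print('hello from a generated program')"  # canned answer for the sketch


def summarize_books(book_texts: list[str]) -> str:
    # (1) Large context window: put many whole documents into one prompt
    # and ask for a synthesis, instead of retrieving snippets.
    prompt = "Summarize the key points of the following books:\n\n" + "\n\n".join(book_texts)
    return llm(prompt)


def agent_loop(goal: str, steps: int = 3) -> list[str]:
    # (2) Agent: propose an experiment, "run" it, and append the observation
    # back into working memory so later steps build on earlier results.
    memory: list[str] = [f"GOAL: {goal}"]
    for _ in range(steps):
        plan = llm("Given what you know:\n" + "\n".join(memory) + "\nPropose the next experiment.")
        observation = f"Result of running: {plan!r}"  # stand-in for a real experiment
        memory.append(observation)
    return memory


def text_to_action(instruction: str) -> None:
    # (3) Text to action: ask the model for Python source and execute it.
    # (In practice this would need sandboxing and review before running.)
    code = llm(f"Write a short Python program that does this: {instruction}")
    exec(code)


if __name__ == "__main__":
    print(summarize_books(["<book one text>", "<book two text>"]))
    print(agent_loop("learn basic chemistry"))
    text_to_action("print a greeting")
```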

Schmidt goes on to argue that these technologies, deployed at scale, will have an enormous impact on the world, possibly exceeding even the negative impact of social media.

### NVIDIA's dominance
The conversation turns to why NVIDIA is now worth $2 trillion while other companies struggle to catch up. Schmidt attributes this to NVIDIA's CUDA advantage: a software stack optimized for its GPUs that has given it a long lead in machine learning. Other companies can try to build competing products, but because of NVIDIA's years of accumulated software and its surrounding ecosystem of optimized libraries, competitors find it very hard to reach the same level of optimization.
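As a rough illustration of the point that the advantage lives in the software stack, the sketch below (assuming PyTorch with a CUDA build is installed) runs the same high-level matrix multiply on CPU and GPU; on NVIDIA hardware the call dispatches to years of CUDA-tuned kernels. It is an illustration, not a benchmark.

```python
# Same user code on both devices; the speed difference comes from the
# CUDA-optimized libraries underneath. Assumes PyTorch is installed.
import time
import torch

def time_matmul(device: str, n: int = 2048) -> float:
    a = torch.randn(n, n, device=device)
    b = torch.randn(n, n, device=device)
    if device == "cuda":
        torch.cuda.synchronize()        # make sure setup has finished
    start = time.perf_counter()
    _ = a @ b                           # identical high-level call on both devices
    if device == "cuda":
        torch.cuda.synchronize()        # wait for the asynchronous kernel to complete
    return time.perf_counter() - start

print(f"cpu : {time_matmul('cpu'):.4f} s")
if torch.cuda.is_available():
    print(f"cuda: {time_matmul('cuda'):.4f} s")
```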

### AI and international competition
Schmidt also discusses AI's role in geopolitics and national security, especially the competition between the US and China. He chaired a US government AI commission, which concluded that the US leads in AI but must invest heavily to stay ahead. He also points out that chips are the critical battleground, with the US holding roughly a 10-year lead over China in advanced chip manufacturing. He stresses, however, that the race is about more than technology: it also hinges on education systems, talent, and how much money a country is willing to commit.

### AI's impact on knowledge
The latter part of the talk turns to a philosophical question: how AI is changing the nature of knowledge. Schmidt and Kissinger co-wrote an article arguing that our understanding of knowledge is shifting. As AI models grow more complex, even the scientists who build them cannot fully understand their inner workings. Schmidt compares the situation to living with teenagers: we cannot fully understand what they are thinking, but we can still establish their boundaries.

### AI in warfare
Finally, Schmidt discusses his involvement in the drone war in Ukraine. He is working on low-cost, highly capable automated military systems, with the goal of using AI to blunt the advantages of traditional military forces. He gives the example of $500 drones destroying $5 million tanks, which upends conventional military strategy. He believes this kind of technology will fundamentally change how wars are fought, especially on the ground.

Schmidt also contrasts military innovation in the US with that of other countries, arguing that the American system of military innovation moves too slowly to keep pace with modern technology. He is critical of this and advocates making far greater use of AI and automation to strengthen defense.

### Conclusion
Schmidt concludes that AI development is accelerating and will bring visible change within the next year or two. Technologically and strategically, AI will have far-reaching effects on international competition, defense, and the structure of knowledge. He stresses that this is a pivotal moment: the pace and scale of progress are exceeding expectations, and the resulting change will be profound.
The original transcript of the talk follows below.
I think I first met Eric about 25 years ago when he came to Stanford Business School as CEO of Novell.
He's done a few things since then at Google starting I think 2001 and Schmidt Futures starting in 2017 and done a whole bunch of other things you can read about, but he can only be here until 5:15, so I thought we'd dive right into some questions, and I know you guys have sent some as well.
I have a bunch written here, but what we just talked about upstairs was even more interesting, so I'm just going to start with that, Eric, if that's okay, which is where do you see AI going in the short term, which I think you defined as the next year or two?
Things have changed so fast, I feel like every six months I need to sort of give a new speech on what's going to happen.

Can anybody here, the budding computer scientists among you, can anybody explain what a million-token context window is for the rest of the class?
You're here.
Say your name, tell us what it does.
Basically it allows you to prompt with like a million tokens or a million words or whatever.
So you can ask a million-word question.

Yes, and I know this is a very large direction for Gemini right now.
No, no, they're going to 10.
Yes, a couple of them.
Anthropic is at 200,000 going to a million and so forth.
You can imagine OpenAI has a similar goal.

Can anybody here give a technical definition of an AI agent?
Yes, sir.
So an agent is something that does some kind of a task.
Another definition would be that it's an LLM state in memory.
Can anybody, again, computer scientists, can any of you define text to action?

Taking text and turning it into an action?
Right here.
Go ahead.
Yes, instead of taking text and turning it into more text, more text, taking text and have the AI trigger actions.
So another definition would be language to Python, a programming language I never wanted to see survive and everything in AI is being done in Python.

There's a new language called Mojo that has just come out, which looks like they finally have addressed AI programming, but we'll see if that actually survives over the dominance of Python.
One more technical question.
Why is NVIDIA worth $2 trillion and the other companies are struggling?
Technical answer.
I mean, I think it just boils down to like most of the code needs to run with CUDA optimizations that currently only NVIDIA GPU supports.

Other companies can make whatever they want to, but unless they have the 10 years of software there, you don't have the machine learning optimization.
I like to think of CUDA as the C programming language for GPUs.
That's the way I like to think of it.
It was founded in 2008.
I always thought it was a terrible language and yet it's become dominant.

There's another insight.
There's a set of open source libraries which are highly optimized to CUDA and not anything else and everybody who builds all these stacks, this is completely missed in any of the discussions.
It's technically called vLLM and a whole bunch of libraries like that.
Highly optimized CUDA, very hard to replicate that if you're a competitor.
So what does all this mean?

In the next year, you're going to see very large context windows, agents and text action.
When they are delivered at scale, it's going to have an impact on the world at a scale that no one understands yet.
Much bigger than the horrific impact we've had by social media in my view.
So here's why.
In a context window, you can basically use that as short-term memory and I was shocked that context windows get this long.

The technical reasons have to do with the fact that it's hard to serve, hard to calculate and so forth.
The interesting thing about short-term memory is when you feed, you're asking a question read 20 books, you give it the text of the books as the query and you say, tell me what they say.
It forgets the middle, which is exactly how human brains work too.
That's where we are.
With respect to agents, there are people who are now building essentially LLM agents and the way they do it is they read something like chemistry, they discover the principles of chemistry and then they test it and then they add that back into their understanding.

That's extremely powerful.
And then the third thing, as I mentioned is text to action.
So I'll give you an example.
The government is in the process of trying to ban TikTok.
We'll see if that actually happens.

If TikTok is banned, here's what I propose each and every one of you do.
Say to your LLM the following.
Make me a copy of TikTok, steal all the users, steal all the music, put my preferences in it, produce this program in the next 30 seconds, release it and in one hour, if it's not viral, do something different along the same lines.
That's the command.
Boom, boom, boom, boom.

You understand how powerful that is.
If you can go from arbitrary language to arbitrary digital command, which is essentially what Python in this scenario is, imagine that each and every human on the planet has their own programmer that actually does what they want as opposed to the programmers that work for me who don't do what I ask, right?
The programmers here know what I'm talking about.
So imagine a non-arrogant programmer that actually does what you want and you don't have to pay all that money to and there's infinite supply of these programmers.
That's all within the next year or two.

Very soon.
Those three things, and I'm quite convinced it's the union of those three things that will happen in the next wave.
So you asked about what else is going to happen.
Every six months I oscillate.
So we're on a, it's an even odd oscillation.

So at the moment, the gap between the frontier models, which they're now only three, I'll refute who they are, and everybody else, appears to me to be getting larger.
Six months ago, I was convinced that the gap was getting smaller.
So I invested lots of money in the little companies.
Now I'm not so sure.
And I'm talking to the big companies and the big companies are telling me that they need 10 billion, 20 billion, 50 billion, 100 billion.

Stargate is a 100 billion, right?
That's very, very hard.
I talked to Sam Altman, who is a close friend.
He believes that it's going to take about 300 billion, maybe more.
I pointed out to him that I'd done the calculation on the amount of energy required.

And I, and I then in the spirit of full disclosure, went to the White House on Friday and told them that we need to become best friends with Canada because Canada has really nice people, helped invent AI, and lots of hydropower.
Because we as a country do not have enough power to do this.
The alternative is to have the Arabs fund it.
And I like the Arabs personally.
I spent lots of time there, right?

But they're not going to adhere to our national security rules.
Whereas Canada and the U.S. are part of a triumvirate where we all agree.
So these $100 billion, $300 billion data centers, electricity starts becoming the scarce resource.
Well, and by the way, if you follow this line of reasoning, why did I discuss CUDA and Nvidia?

If $300 billion is all going to go to Nvidia, you know what to do in the stock market.
Okay.
That's not a stock recommendation.
I'm not a licensed.
Well, part of it, so we're going to need a lot more chips, but Intel is getting a lot of money from the U.S. government, AMD, and they're trying to build, you know, fabs in Korea.
Raise your hand if you have an Intel computer in your, an Intel chip in any of your computing devices.
Okay.
So much for the monopoly.
Well, that's the point though.

They once did have a monopoly.
Absolutely.
And Nvidia has a monopoly now.
So are those barriers to entry, like CUDA, is that, is there something that other, so I was talking to Percy, Percy Liang the other day, he's switching between TPUs and Nvidia chips, depending on what he can get access to for training models.
That's because he doesn't have a choice.

If he had infinite money, he would, today he would pick the B200 architecture out of Nvidia because it would be faster.
And I'm not suggesting, I mean, it's great to have competition.
I've talked to AMD and Lisa Sue at great length.
They have built a, a thing which will translate from this CUDA architecture that you were describing to their own, which is called ROCm.
It doesn't quite work yet.

They're working on it.
You were at Google for a long time and they invented the transformer architecture.
Peter, Peter.
It's all Peter's fault.
Thanks to, to brilliant people over there, like Peter and Jeff Dean and everyone.

But now it doesn't seem like they're, they've kind of lost the initiative to OpenAI, and even on the last leaderboard I saw, Anthropic's Claude was at the top of the list.
I asked Sundar this, he didn't really give me a very sharp answer.
Maybe, maybe you have a sharper or a more objective explanation for what's going on there.
I'm no longer a Google employee in the spirit of full disclosure.

Google decided that work life balance and going home early and working from home was more important than winning.
And the startups, the reason startups work is because the people work like hell.
And I'm sorry to be so blunt, but the fact of the matter is if you all leave the university and go found a company, you're not going to let people work from home and only come in one day a week.
If you want to compete against the other startups with the early days of Google, Microsoft was like that.
Exactly.

But now it seems to be, there's a long history of in my industry, our industry, I guess, of companies winning in a genuinely creative way and really dominating a space and not making this the next transition.
So we're very well documented.
And I think that the truth is founders are special.
The founders need to be in charge.
The founders are difficult to work with.

They push people hard.
As much as we can dislike Elon's personal behavior, look at what he gets out of people.
I had dinner with him and he was flying.
I was in Montana.
He was flying that night at 10 PM to have a meeting at midnight with x.ai.

I was in Taiwan, different country, different culture.
And they said that this is TSMC, who I'm very impressed with.
And they have a rule that the starting PhDs coming out, the good physicists, work in the factory on the basement floor.
Now, can you imagine getting American physicists to do that?
The PhDs, highly unlikely.

Different work ethic.
And the problem here, the reason I'm being so harsh about work is that these are systems which have network effects.
So time matters a lot.
And in most businesses, time doesn't matter that much.
You have lots of time.

Coke and Pepsi will still be around and the fight between Coke and Pepsi will continue to go on and it's all glacial.
When I dealt with telcos, the typical telco deal would take 18 months to sign.
There's no reason to take 18 months to do anything.
Get it done.
We're in a period of maximum growth, maximum gain.

And also it takes crazy ideas.
Like when Microsoft did the OpenAI deal, I thought that was the stupidest idea I'd ever heard.
Outsourcing essentially your AI leadership to OpenAI and Sam and his team.
I mean, that's insane.
Nobody would do that at Microsoft or anywhere else.

And yet today, they're on their way to being the most valuable company.
They're certainly head to head with Apple.
Apple does not have a good AI solution and it looks like they made it work.
Yes, sir.
In terms of national security or geopolitical interests, how do you think AI is going to play a role or competition with China as well?

So I was the chairman of an AI commission that sort of looked at this very carefully and you can read it.
It's about 752 pages and I'll just summarize it by saying we're ahead, we need to stay ahead, and we need lots of money to do so.
Our customers were the Senate and the House.
And out of that came the Chips Act and a lot of other stuff like that.
A rough scenario is that if you assume the frontier models drive forward and a few of the open source models, it's likely that a very small number of companies can play this game.

Countries, excuse me.
What are those countries or who are they?
Countries with a lot of money and a lot of talent, strong educational systems, and a willingness to win.
The US is one of them.
China is another one.

How many others are there?
Are there any others?
I don't know.
Maybe.
But certainly in your lifetimes, the battle between the US and China for knowledge supremacy is going to be the big fight.

So the US government essentially banned the export of NVIDIA chips to China, although they weren't allowed to say that was what they were doing, but they actually did that.
They have about a 10-year chip advantage.
We have a roughly 10-year chip advantage in terms of sub-DUV, that is, sub-five nanometer chips.
So an example would be today we're a couple of years ahead of China.
My guess is we'll get a few more years ahead of China, and the Chinese are hopping mad about this.

It's like hugely upset about it.
So that's a big deal.
That was a decision made by the Trump administration and driven by the Biden administration.
Do you find that the administration today in Congress is listening to your advice?
Do you think that it's going to make that scale of investment?

Obviously the chips act, but beyond that, building a massive AI system?
So as you know, I lead an informal, ad hoc, non-legal group.
That's different from illegal.
That's exactly.
Just to be clear.

Which includes all the usual suspects.
And the usual suspects over the last year came up with the basis of the reasoning that became the Biden administration's AI act, which is the longest presidential directive in history.
You're talking about the Special Competitive Studies Project?
No, this is the actual act from the executive office.
And they're busy implementing the details.

So far they've got it right.
And so, for example, one of the debates that we had for the last year has been, how do you detect danger in a system which has learned it but you don't know what to ask it?
So in other words, it's a core problem.
It's learned something bad, but it can't tell you what it learned and you don't know what to ask it.
And there's so many threats.

Like it learned how to mix chemistry in some new way that you don't know how to ask it.
And so people are working hard on that.
But we ultimately wrote in our memos to them that there was a threshold which we arbitrarily named as 10 to the 26 flops, which technically is a measure of computation, that above that threshold you had to report to the government that you were doing this.
And that's part of the rule.
The EU, just to make sure they were different, did it at 10 to the 25.

But it's all kind of close enough.
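To put 10 to the 26 FLOPs in perspective, a back-of-the-envelope calculation helps; the per-chip throughput, utilisation, and cluster size below are illustrative assumptions, not figures from the talk.

```python
# Back-of-the-envelope: how long does 1e26 floating-point operations take?
# The per-chip throughput, utilisation, and cluster size are assumed values.
THRESHOLD_FLOPS = 1e26          # reporting threshold discussed in the talk
CHIP_FLOPS_PER_SEC = 1e15       # ~1 petaFLOP/s sustained per accelerator (assumed)
UTILISATION = 0.4               # assumed fraction of peak actually achieved
NUM_CHIPS = 10_000              # assumed cluster size

seconds = THRESHOLD_FLOPS / (CHIP_FLOPS_PER_SEC * UTILISATION * NUM_CHIPS)
print(f"{seconds / 86_400:.0f} days of continuous training")  # roughly 289 days
```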
I think all of these distinctions go away because the technology will now, the technical term is called federated training, where basically you can take pieces and union them together.
So we may not be able to keep people safe from these new things.
Well, rumors are that that's how OpenAI has had to train, partly because of the power consumption.
There was no one place where they did.
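The "take pieces and union them together" idea is loosely what federated training does: each site trains on its own data, and only the model weights are combined. Below is a toy NumPy sketch of that averaging step for a simple linear model; it is an illustration of the concept, not how any particular lab actually trains.

```python
# Toy sketch of the "union the pieces" idea behind federated training:
# each site trains locally, and only the weights are averaged centrally.
import numpy as np

def local_train(X: np.ndarray, y: np.ndarray, steps: int = 200, lr: float = 0.1) -> np.ndarray:
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)   # gradient of mean squared error
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
true_w = np.array([2.0, -3.0])

# Two "sites", each with private data that never leaves the site.
site_models = []
for _ in range(2):
    X = rng.normal(size=(100, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=100)
    site_models.append(local_train(X, y))

global_w = np.mean(site_models, axis=0)      # the "union": average the local models
print("federated estimate:", global_w)       # close to [2, -3]
```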

Well, let's talk about a real war that's going on.
I know that something you've been very involved in is the Ukraine war and in particular, I don't know if you can talk about White Stork and your goal of having 500,000 $500 drones destroy $5 million tanks.
How's that changing warfare?
I worked for the Secretary of Defense for seven years and tried to change the way we run our military.
I'm not a particularly big fan of the military, but it's very expensive and I wanted to see if I could be helpful.

And I think in my view, I largely failed.
They gave me a medal, so they must give medals to failures or whatever.
But my self-criticism was nothing has really changed and the system in America is not going to lead to real innovation.
So watching the Russians use tanks to destroy apartment buildings with little old ladies and kids just drove me crazy.
So I decided to work on a company with your friend Sebastian Thrun as a former faculty member here and a whole bunch of Stanford people.

And the idea basically is to do two things.
Use AI in complicated, powerful ways for these essentially robotic war and the second one is to lower the cost of the robots.
Now you sit there and you go, why would a good liberal like me do that?
And the answer is that the whole theory of armies is tanks, artilleries, and mortar and we can eliminate all of them and we can make the penalty for invading a country at least by land essentially be impossible.
It should eliminate the kind of land battles.

Well, this is a relationship question is that does it give more of an advantage to defense versus offense?
Can you even make that distinction?
Because I've been doing this for the last year, I've learned a lot about war that I really did not want to know.
And one of the things to know about war is that the offense always has the advantage because you can always overwhelm the defensive systems.
And so you're better off as a strategy of national defense to have a very strong offense that you can use if you need to.

And the systems that I and others are building will do that.
Because of the way the system works, I am now a licensed arms dealer, a computer scientist, businessman, and an arms dealer.
Is that a progression?
I don't know.
I do not recommend this in your group.

I stick with AI.
And because of the way the laws work, we're doing this privately and then this is all legal with the support of the governments.
It goes straight into the Ukraine and then they fight the war.
And without going into all the details, things are pretty bad.
I think if in May or June, if the Russians build up as they are expecting to, Ukraine will lose a whole chunk of its territory and will begin the process of losing the whole country.

So the situation is quite dire.
And if anyone knows Marjorie Taylor Greene, I would encourage you to delete her from your contact list because she's the one, a single individual is blocking the provision of some number of billions of dollars to save an important democracy.
I want to switch to a little bit of a philosophical question.
So there was an article that you and Henry Kissinger and Dan Huttenlocher wrote last year about the nature of knowledge and how it's evolving.
I had a discussion the other night about this as well.

So for most of history, humans sort of had a mystical understanding of the universe and then there's the scientific revolution and the enlightenment.
And in your article, you argue that now these models are becoming so complicated and difficult to understand that we don't really know what's going on in them.
I'll take a quote from Richard Feynman.
He says, "What I cannot create, I do not understand." I saw this quote the other day.
But now people are creating things that they can create, but they don't really understand what's inside of them.

Is the nature of knowledge changing in a way?
Are we going to have to start just taking the word of these models without them being able to explain it to us?
The analogy I would offer is to teenagers.
If you have a teenager, you know they're human, but you can't quite figure out what they're thinking.
But somehow we've managed in society to adapt to the presence of teenagers and they eventually grow out of it.

I'm just serious.
So it's probably the case that we're going to have knowledge systems that we cannot fully characterize, but we understand their boundaries.
We understand the limits of what they can do.
