Source: Data Observer (數(shù)據(jù)觀) | Published: 2019-08-20 15:27:07 | Author: Will Knight
The world’s top deepfake artist is wrestling with the monster he created
Data Observer | Translated by Wang Jie (王婕)
Hao Li has spent his career perfecting digital trickery. Now he’s working to confront the problem of increasingly seamless off-the-shelf deception.
The author got a digital facelift, making it look as though Elon Musk were the one doing the talking.
VIDEO EDITED BY HAO LI.
In the video, "Elon Musk" announces that he is running for US president, that Tesla is developing a flying car, and that he even plans to experiment on his own brain at a new company. Each claim is striking enough to be believed, but at the end the creator reveals that the clip is really meant to show how sophisticated deepfakes have become.
It’s June in Dalian, China, a city on a peninsula that sticks out into the Yellow Sea a few hundred miles from Beijing in one direction and from the North Korean border in the other. Hao Li is standing inside a cavernous, angular building that might easily be a Bond villain’s lair. Outside, the weather is sweltering, and security is tight. The World Economic Forum’s annual conference is in town.
Near Li, politicians and CEOs from around the world take turns stepping into a booth. Inside, they laugh as their face is transformed into that of a famous person: Bruce Lee, Neil Armstrong, or Audrey Hepburn. The trick happens in real time, and it works almost flawlessly.
The remarkable face-swapping machine wasn’t set up merely to divert and amuse the world’s rich and powerful. Li wants these powerful people to consider the consequences that videos doctored with AI—“deepfakes”—could have for them, and for the rest of us.
Misinformation has long been a popular tool of geopolitical sabotage, but social media has injected rocket fuel into the spread of fake news. When fake video footage is as easy to make as fake news articles, it is a virtual guarantee that it will be weaponized. Want to sway an election, ruin the career and reputation of an enemy, or spark ethnic violence? It’s hard to imagine a more effective vehicle than a clip that looks authentic, spreading like wildfire through Facebook, WhatsApp, or Twitter, faster than people can figure out they’ve been duped.
As a pioneer of digital fakery, Li worries that deepfakes are only the beginning. Despite having helped usher in an era when our eyes cannot always be trusted, he wants to use his skills to do something about the looming problem of ubiquitous, near-perfect video deception.
The question is, might it already be too late?
Rewriting reality
Li isn’t your typical deepfaker. He doesn’t lurk on Reddit posting fake pornor reshoots of famous movies modified to star Nicolas Cage. He’s spent his career developing cutting-edge techniques to forge faces more easily and convincingly. He has also messed with some of the most famous faces in the world for modern blockbusters, fooling millions of people into believing in a smile or a wink that was never actually there. Talking over Skype from his office in Los Angeles one afternoon, he casually mentions that Will Smith stopped in recently, for a movie he’s working on.
Actors often come to Li’s lab at the University of Southern California (USC) to have their likeness digitally scanned. They are put inside a spherical array of lights and machine vision cameras to capture the shape of their face, facial expressions, and skin tone and texture down to the level of individual pores. A special-effects team working on a movie can then manipulate scenes that have already been shot, or even add an actor to a new one in post-production.
COURTESY OF HAO LI
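The general principle behind this kind of light-rig capture can be illustrated with classic photometric stereo: photograph the same surface under several known lighting directions and solve for a surface normal and albedo at every pixel. The sketch below is only a minimal, generic illustration of that idea, not Li's actual pipeline; the `images` and `light_dirs` arrays are assumed inputs, and a real light stage recovers far more (pore-level texture, skin reflectance) than this.

```python
import numpy as np

def photometric_stereo(images, light_dirs):
    """images: (k, h, w) grayscale shots of the same face, one per light.
    light_dirs: (k, 3) unit vectors giving each light's direction.
    Returns per-pixel surface normals and albedo, assuming a Lambertian surface."""
    k, h, w = images.shape
    I = images.reshape(k, -1)                            # (k, h*w) pixel intensities
    # Solve light_dirs @ G ≈ I in the least-squares sense for each pixel.
    G, *_ = np.linalg.lstsq(light_dirs, I, rcond=None)   # (3, h*w) albedo-scaled normals
    G = G.T.reshape(h, w, 3)
    albedo = np.linalg.norm(G, axis=-1)
    normals = G / np.maximum(albedo[..., None], 1e-8)    # unit normals
    return normals, albedo
```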
Such digital deception is now common in big-budget movies. Backgrounds are often rendered digitally, and it’s common for an actor’s face to be pasted onto a stunt person’s in an action scene. That’s led to some breathtaking moments for moviegoers, as when a teenage Princess Leia briefly appeared at the end of Rogue One: A Star Wars Story, even though the actress who had played Leia, Carrie Fisher, was nearly 60 when the movie was shot.
Making these effects look good normally requires significant expertise and millions of dollars. But thanks to advances in artificial intelligence, it is now almost trivial to swap two faces in a video, using nothing more powerful than a laptop. With a little extra knowhow, you can make a politician, a CEO, or a personal enemy say or do anything you want (as in the video at the top of the story, in which Li mapped Elon Musk's likeness onto my face).
A history of trickery
In person, Li looks more cyberpunk than Sunset Strip. His hair is shaved into a Mohawk that flops down on one side, and he often wears a black T-shirt and leather jacket. When speaking, he has an odd habit of blinking in a way that betrays late nights spent in the warm glow of a computer screen. He isn’t shy about touting the brilliance of his tech, or what he has in the works. During conversations, he likes to whip out a smartphone to show you something new.
COURTESY OF HAO LI
Li grew up in Saarbrücken, Germany, the son of Taiwanese immigrants. He attended a French-German high school and learned to speak four languages fluently (French, German, English, and Mandarin). He remembers the moment that he decided to spend his time blurring the line between reality and fantasy. It was 1993, when he saw a huge dinosaur lumber into view in Steven Spielberg’s Jurassic Park. As the actors gawped at the computer-generated beast, Li, then 12, grasped what technology had just made possible. “I realized you could now basically create anything, even things that don’t even exist,” he recalls.
Li got his PhD at ETH Zurich, a prestigious technical university in Switzerland, where one of his advisors remembers him as both a brilliant student and an incorrigible prankster. Videos accompanying academic papers sometimes included less-than-flattering caricatures of his teachers.
Paul Walker's brothers provided a template for his digital likeness in Furious 7. WETA DIGITAL
Shortly after joining USC, Li created facial tracking technology used to make a digital version of the late actor Paul Walker for the action movie Furious 7. It was a big achievement, since Walker, who died in a car accident halfway through shooting, had not been scanned beforehand, and his character needed to appear in so many scenes. Li’s technology was used to paste Walker’s face onto the bodies of his two brothers, who took turns acting in his place in more than 200 scenes.
The movie, which grossed $1.5 billion at the box office, was the first to depend so heavily on a digitally re-created star. Li mentions Walker’s virtual role when talking about how good video trickery is becoming. “Even I can’t tell which ones are fake,” he says with a shake of his head.
Virtually you
In 2009, less than a decade before deepfakes emerged, Li developed a way to capture a person’s face in real time and use it to operate a virtual puppet. This involved using the latest depth sensors and new software to map that face, and its expressions, to a mask made of deformable virtual material.
Most important, the approach worked without the need to add dozens of motion-tracking markers to a person’s face, a standard industry technique for tracking face movement. Li contributed to the development of software called Faceshift, which would later be commercialized as a university spinoff. The company was acquired by Apple in 2015, and its technology was used to create the Animoji software that lets you turn yourself into a unicorn or a talking pile of poop on the latest iPhones.
An example of marker-based face tracking. FACEWARE TECHNOLOGIES
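One common way a markerless capture system like the one described above drives a digital character is blendshape retargeting: each frame of the tracked face is expressed as a weighted mix of predefined expression shapes, and those weights animate the avatar. The sketch below is a hypothetical simplification, not Faceshift's actual solver; `neutral`, `blendshapes`, and `observed` are assumed to come from whatever tracker is in use.

```python
import numpy as np
from scipy.optimize import nnls  # non-negative least squares

def solve_blendshape_weights(neutral, blendshapes, observed):
    """neutral: (L, 3) landmark positions of the neutral face.
    blendshapes: (B, L, 3) landmark positions of each expression target.
    observed: (L, 3) landmarks tracked on the performer this frame.
    Returns B expression weights that best explain the observed offsets."""
    deltas = (blendshapes - neutral[None]).reshape(len(blendshapes), -1).T  # (3L, B)
    offset = (observed - neutral).reshape(-1)                               # (3L,)
    weights, _ = nnls(deltas, offset)        # weights are constrained to be >= 0
    return np.clip(weights, 0.0, 1.0)        # clamp so no expression exceeds "fully on"
```

The resulting weight vector can be fed to any character rigged with the same blendshape set, which is why the same capture can drive a unicorn or a talking pile of poop just as easily as a realistic face.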
Li and his students have published dozens of papers on such topics as avatars that mirror whole body movements, highly realistic virtual hair, and simulated skin that stretches the way real skin does. In recent years, his group has drawn on advances in machine learning and especially deep learning, a way of training computers to do things using a large simulated neural network. His research has also been applied to medicine, helping develop ways of tracking tumors inside the body and modeling the properties of bones and tissue.
Today, Li splits his time between teaching, consulting for movie studios, and running a new startup, Pinscreen. The company uses more advanced AI than is behind deepfakes to make virtual avatars. Its app turns a single photo into a photorealistic 3D avatar in a few seconds. It employs machine-learning algorithms that have been trained to map the appearance of a face onto a 3D model using many thousands of still images and corresponding 3D scans. The process is improved using what are known as generative adversarial networks, or GANs (which are not used for most deepfakes). This means having one algorithm produce fake images while another judges whether they are fake, a process that gradually improves the fakery. You can have your avatar perform silly dances and try on different outfits, and you can control the avatar’s facial expressions in real time, using your own face via the camera on your smartphone.
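To make the GAN idea concrete, here is a minimal, generic training loop of the kind described above: one network generates images from noise while a second scores them against real examples, and each is updated against the other. This toy sketch (tiny fully connected networks over flattened 32×32 images) is illustrative only and is not Pinscreen's model.

```python
import torch
import torch.nn as nn

# Toy generator and discriminator; real avatar systems use far larger image models.
G = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 32 * 32))
D = nn.Sequential(nn.Linear(32 * 32, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_images):                  # real_images: (batch, 32*32) floats
    batch = real_images.size(0)

    # 1) The discriminator learns to tell real images from generated ones.
    fake = G(torch.randn(batch, 64)).detach()
    loss_d = bce(D(real_images), torch.ones(batch, 1)) + \
             bce(D(fake), torch.zeros(batch, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # 2) The generator learns to produce images the discriminator accepts as real.
    fake = G(torch.randn(batch, 64))
    loss_g = bce(D(fake), torch.ones(batch, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```

Because the two networks improve in tandem, the generator's output gradually becomes harder to distinguish from real photographs, which is exactly what makes the same family of techniques useful for polishing fakes.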
A former employee, Iman Sadeghi, is suing Pinscreen, alleging it faked a presentation of the technology at the SIGGRAPH conference in 2017. MIT Technology Review has seen letters from several experts and SIGGRAPH organizers dismissing those claims.
Pinscreen is working with several big-name clothing retailers that see its technology as a way to let people try garments on without having to visit a physical store. The technology could also be big for videoconferencing, virtual reality, and gaming. Just imagine a Fortnite character that not only looks like you, but also laughs and dances the same way.
Avatars made using the Pinscreen app. COURTESY OF HAO LI
Underneath the digital silliness, though, is an important trend: AI is rapidly making advanced image manipulation the province of the smartphone rather than the desktop. FaceApp, developed by a company in Saint Petersburg, Russia, has drawn millions of users, and recent controversy, by offering a one-click way to change a face on your phone. You can add a smile to a photo, remove blemishes, or mess with your age or gender (or someone else’s). Dozens more apps offer similar manipulations at the click of a button.
Not everyone is excited about the prospect of this technology becoming ubiquitous. Li and others are “basically trying to make one-image, mobile, and real-time deepfakes,” says Sam Gregory, director of Witness, a nonprofit focused on video and human rights. “That’s the threat level that worries me, when it [becomes] something that’s less easily controlled and more accessible to a range of actors.”
Fortunately, most deepfakes still look a bit off. A flickering face, a wonky eye, or an odd skin tone make them easy enough to spot. But just as an expert can remove such flaws, advances in AI promise to smooth them out automatically, making the fake videos both simpler to create and harder to detect.
Even as Li races ahead with digital fakery, he is also troubled by the potential for harm. “We’re sitting in front of a problem,” he says.
Catching imposters
US policymakers are especially concerned about how deepfakes might be used to spread more convincing fake news and misinformation ahead of next year’s presidential election. Earlier this month, the House Intelligence Committee asked Facebook, Google, and Twitter how they planned to deal with the threat of deepfakes. Each company said it was working on the problem, but none offered a solution.
DARPA, the US military’s well-funded research agency, is also worried about the rise of digital manipulation. In 2016, before deepfakes became a thing, DARPA launched a program called Media Forensics, or MediFor, to encourage digital forensics experts to develop automated tools for catching manipulated imagery. A human expert might use a range of methods to spot photographic forgeries, from analyzing inconsistencies in a file’s data or the characteristics of specific pixels to hunting for physical inconsistencies such as a misplaced shadow or an improbable angle.
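One classic example of the kind of file- and pixel-level inconsistency check an analyst might run is error level analysis: re-save a JPEG at a known quality and look for regions whose compression error differs from the rest of the image, a hint that they were edited and re-compressed separately. The sketch below is a generic illustration of that single technique, not MediFor tooling; the file path is a placeholder.

```python
from PIL import Image, ImageChops

def error_level_analysis(path, quality=90):
    """Re-save a JPEG at a fixed quality and diff it against the original.
    Edited regions often show a different error level than untouched ones."""
    original = Image.open(path).convert("RGB")
    original.save("_resaved.jpg", "JPEG", quality=quality)
    resaved = Image.open("_resaved.jpg")
    diff = ImageChops.difference(original, resaved)
    # The raw differences are usually faint, so scale them up for inspection.
    extrema = diff.getextrema()
    max_diff = max(channel_max for _, channel_max in extrema) or 1
    return diff.point(lambda px: px * 255 // max_diff)

# error_level_analysis("suspect_photo.jpg").save("ela_map.png")
```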
MediFor is now largely focused on spotting deepfakes. Detection is fundamentally harder than creation because AI algorithms can learn to hide things that give fakes away. Early deepfake detection methods include tracking unnatural blinking and weird lip movements. But the latest deepfakes have already learned to automatically smooth out such glitches.
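A simple version of the blink-based check computes an "eye aspect ratio" from facial landmarks in each frame, counts how often it dips, and compares that rate with normal human blinking (roughly 15 to 20 blinks per minute). The sketch below assumes you already have six eye landmarks per frame from some facial-landmark detector; it illustrates the idea rather than a production detector, and, as noted above, newer deepfakes have learned to blink convincingly.

```python
import numpy as np

def eye_aspect_ratio(eye):
    """eye: (6, 2) eye landmarks, ordered as in the common 68-point scheme.
    The ratio drops sharply when the eye closes."""
    v1 = np.linalg.norm(eye[1] - eye[5])
    v2 = np.linalg.norm(eye[2] - eye[4])
    h = np.linalg.norm(eye[0] - eye[3])
    return (v1 + v2) / (2.0 * h)

def blinks_per_minute(ear_series, fps, threshold=0.2):
    """Count transitions from open to closed (blink onsets) in a ratio series."""
    closed = np.asarray(ear_series) < threshold
    onsets = np.count_nonzero(closed[1:] & ~closed[:-1])
    minutes = len(ear_series) / (fps * 60.0)
    return onsets / minutes if minutes else 0.0

# An implausibly low blink rate in a talking-head clip was an early red flag.
```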
Earlier this year, Matt Turek, DARPA program manager for MediFor, asked Li to demonstrate his fakes to the MediFor researchers. This led to a collaboration with Hany Farid, a professor at UC Berkeley and one of the world’s foremost authorities on digital forensics. The pair are now engaged in a digital game of cat-and-mouse, with Li developing deepfakes for Farid to catch, and then refining them to evade detection.
Farid, Li, and others recently released a paper outlining a new, more powerful way to spot deepfakes. It hinges on training a machine-learning algorithm to recognize the quirks of a specific individual’s facial expressions and head movements. If you simply paste someone’s likeness onto another face, those features won’t be carried over. It would require a lot of computer power and training data—i.e., images or video of the person—to make a deepfake that incorporates these characteristics. But one day it will be possible. “Technical solutions will continue to improve on the defensive side,” says Turek. “But will that be perfect? I doubt it.”
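In spirit, the approach summarizes each clip by how a person's expression and head-pose signals move together over time, then trains a classifier on those signatures. The sketch below is a simplified, hypothetical two-class version of that idea; it assumes per-frame expression and head-pose features have already been extracted by some other tool (for example, an action-unit tracker such as OpenFace), and it is not the authors' exact method, which models one specific individual's mannerisms.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def clip_signature(frame_features):
    """frame_features: (frames, d) per-frame expression/head-pose measurements.
    Summarize the clip by how those signals co-vary over time — the kind of
    person-specific mannerism a pasted-on face fails to reproduce."""
    corr = np.corrcoef(frame_features.T)            # (d, d) co-movement pattern
    return corr[np.triu_indices_from(corr, k=1)]    # upper triangle as a feature vector

def train_detector(signatures, labels):
    """signatures: clip signatures of genuine footage plus known fakes.
    labels: 1 for real, 0 for fake (prepared by the defender in advance)."""
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
    return model.fit(signatures, labels)
```

As the article notes, copying these temporal signatures into a fake would demand far more computing power and training footage of the target than today's face swaps, which is why the method currently raises the bar for attackers rather than eliminating them.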
Pixel perfect
Back in Dalian, it’s clear that people are starting to wake up to the danger of deepfakes. The morning before I met with Li, a European politician had stepped into the face-swap booth, only for his minders to stop him. They were worried that the system might capture his likeness in detail, making it easier for someone to create fake clips of him.
A Pinscreen employee demonstrates a live face-swap system at the World Economic Forum's conference in Dalian, China, in July. COURTESY OF HAO LI
As he watches people using the booth, Li tells me that there is no technical reason why deepfakes should be detectable. “Videos are just pixels with a certain color value,” he says.
Making them perfect is just a matter of time and resources, and as his collaboration with Farid shows, it’s getting easier all the time. “We are witnessing an arms race between digital manipulations and the ability to detect those,” he says, “with advancements of AI-based algorithms catalyzing both sides.”
The bad news, Li thinks, is that he will eventually win. In a few years, he reckons, undetectable deepfakes could be created with a click. “When that point comes,” he says, “we need to be aware that not every video we see is true.”
Note: This article (published in Chinese as 《探秘被封殺的ins黑科技『深度換臉』》) is compiled and translated from MIT Technology Review. Translated by Data Observer (數(shù)據(jù)觀) / Wang Jie; please credit the translator and source when reprinting.
Editor in charge: Zhang Wei