Lying Sophia &
Mocking Alexa


Text Iris Long 文 龙星如

Exhibition Structure 展览结构图

Sophia, the humanoid robot who became a Saudi Arabian citizen, is interpreted by many as a story intertwined with ambiguity and deception, co-authored by the mass media and technology companies. Alexa, the cloud-based virtual assistant developed by Amazon, was repeatedly reported to let out unsettling, eerie laughter, recordings of which soon went viral on YouTube.

Sophia and Alexa seem to be two contemporary metaphors for machine life, two thin slices interposed among the imbricated discourses on artificial intelligence. Sophia symbolizes the imagination of AI cast by the mass media, film, and television: a highly humanlike appearance, alert and responsive, even diplomatic - a quasi-human walking among us. Alexa, on the other hand, is an “assistant” or “servant” who takes a machine’s form and resides in domestic corners, and whose laughter implies the non-transparent, undisciplinable, even peeping and subversive dimension of the artificial-intelligence black box - a “mistake” to be amended.

Sophia’s lies are projections of poetic imagination; Alexa’s mockery is a glitch in the algorithmic black box. What they share is a quantum-state-like scenario of uncertainty, like “La Zona” in Andrei Tarkovsky’s Stalker. In the alternations and evolutions of technology, we have rarely encountered a subject like artificial intelligence: paradoxical, mind-stimulating, and pregnant with manifold future possibilities. Even though AI is now ubiquitously employed in microchips, processors, and data mining and analysis, forming the new frontier of a global technological competition, it remains imperceptible and equivocal to the ordinary citizen - wrapped within the information of the mass media, AI has become a story that is at once the easiest to tell and the most difficult to narrate.

In Tarkovsky’s script, the stalker guides a writer and a scientist as they take a cable car, evade the policemen’s chase, traverse tunnels of dripping water, and detour through rooms filled with sand dunes, finally approaching the core of “La Zona”: a “Room” that makes beliefs come true. The writer dreads the dark human nature the Room reveals, while the scientist wishes to destroy the Room lest villains take advantage of it. The exhibition sets up a metaphorical “La Zona” that embodies our contemporary situation: a time-space where science and art alike are deprived of the power of autocracy and of credible narratives, and which is filled with the chatter of the “writers” and the “scientists.”

Artists and researchers in this exhibition blend the perspectives of Sophia (bright, poetic, the media imagination) and Alexa (dark, black-box, technological criticism). They investigate how AI reshuffles global technological politics; how it reconstructs the earth’s geology; the absurdity of quantifying human emotions; the dark, exhausting, inhuman labor of training “human-like” algorithms; the incentives to project the entire architecture of the human spirit onto a single technological form; and the fairy tales the mass media build around AI.

“Sophia” and “Alexa” are embodied in the exhibition by text and sound generated by AI algorithms, which weave through it as dialogues. Walking through the exhibition is like a “stalking”: it interweaves the richness, incomputability, and vitality of the psychological world. Will all that we are experiencing “break all the prophecies,” like the event horizon in Vernor Vinge’s assertion, or be “the biggest mistake we have ever made,” as Stephen Hawking warned?

被授予沙特国籍的机器人“索菲亚” 被阐释为媒体和技术企业撰写的暧昧骗局。亚马逊的智能助手艾莉克莎(Alexa)屡次被录下发出“可怖笑声” 的瞬间,一时成为风行于YouTube 的都市传说。

“艾莉克莎” 和“索菲亚” 像是关于机器生命的两个当代隐喻,两块安插在人工智能庞杂话题间的薄片。索菲亚象征媒体和影视里对AI 具有高度拟真容貌、机敏回复力,甚至懂外交的想象——一个行走于我们之间的类人。艾莉克莎是拥有机器外形、存活于私家角落的“助手” 或“仆从”,它的笑声象征关于AI “黑盒” 之不透明、不可规训和潜在窥伺、颠覆的面向——一个需要被修正的错误。

索菲亚的谎言是诗意想象的投射,艾莉克莎的嘲讽是算法黑箱的裂痕,她们共享一种不明朗的处境,犹如《潜行者 》(塔可夫斯基)里的“区” (La Zona)。在科技更迭里,我们很少遇见像AI 一样内含重重悖论,刺激心智,进而指向未来多种可能的课题。哪怕今天AI 已经普世地运用于芯片、处理器、数据收集与分析层面,形成全球技术竞争的新前线,对一个普通人来说,它依然不直接可知、模棱两可,是一个在大众媒体的信息包裹里最好讲也最难讲的故事。

在塔可夫斯基的脚本里,潜行者带着作家与科学家坐缆车,躲过警察追击,穿过滴水隧道,绕过充满沙丘之屋,才接近了“区” 的核心:一个信念成真的房间。作家恐惧于它所暗示的卑陋人性,而科学家希望摧毁房间以免它为恶人所用。展览建构的“区” 犹如今天我们的处境,是科学和文艺同时失去独裁力的架空之所,充斥着“作家”们与“科学家”们的喋喋不休。

展览邀请的艺术家与研究者兼容了索菲亚(光明、诗歌、媒体想象)与艾莉克莎(阴翳、黑盒、技术批判)的视角,探讨AI 对全球技术政治洗牌、其涉及的资源和地质改造、量化感情的荒诞、用真人训练“人性”算法的黑色劳动、投射人类整体精神建筑的动因、AI 的媒介化包装等议题。

索菲亚与艾莉克莎两个角色,也化身人工智能程序所生成的文本及声音,以“对话”的形式贯穿展览。穿越展览的过程如一场“潜行”,它交织着心理世界的饱满、无法计算与一线生机。我们正在经历的一切,究竟会如黑洞的事件穹界一般“打破所有预言”(弗诺· 文奇),还是“我们犯过的最大错误”(史蒂芬· 霍金)?

Iris Long

Researcher on Art, Science and Technology
Central Academy of Fine Arts, Beijing 


Lying Sophia &
Mocking Alexa


Text HE Di 文 贺笛

In recent years, deep learning has pushed the limits of many real-world applications, including speech recognition [1], image classification [2], and machine translation [3]. Deep neural network-based models have even achieved super-human performance in challenging game environments such as Go [4], StarCraft [5], and Dota 2 [6]. The keys to the success of deep learning span many aspects, including advanced neural network architectures [2,3], modern optimization algorithms [7], and massive data and computational power [4,6].

In this project, we mainly leverage deep learning models for natural language processing. The conversations are generated in three steps: conditional sentence generation for the English version, text-to-text translation from English to Chinese, and text-to-speech synthesis. We briefly introduce the deep learning models we used below.

In the conditional sentence generation step, we use the GPT-2 model [8], the current state-of-the-art language generation model, which is based on the Transformer architecture [3,9]. The model is trained to predict the distribution of the next word conditioned on the preceding words in a sentence, using 8 million English web documents, roughly 40 GB of plain text. Since the model can predict plausible words given any context, we can use it to generate a sentence word by word, autoregressively.
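Autoregressive generation simply means sampling one word at a time from the model’s conditional distribution and feeding it back in as context. The sketch below illustrates the loop with an invented toy bigram table standing in for GPT-2’s learned distribution over the full preceding context (all words and probabilities here are made up for illustration):

```python
import random

# Toy next-word model: P(next word | previous word). In GPT-2 the
# condition is the whole preceding context, not just one word, and the
# distribution is learned; here it is hand-written for illustration.
bigram = {
    "<s>":    {"alexa": 0.5, "sophia": 0.5},
    "alexa":  {"can": 0.7, "laughs": 0.3},
    "sophia": {"can": 0.4, "lies": 0.6},
    "can":    {"help": 0.8, "speak": 0.2},
    "help":   {"humans": 1.0},
    "speak":  {"</s>": 1.0},
    "laughs": {"</s>": 1.0},
    "lies":   {"</s>": 1.0},
    "humans": {"</s>": 1.0},
}

def generate(max_len=10, seed=0):
    """Sample a sentence word by word, feeding each sampled word back in."""
    rng = random.Random(seed)
    words, current = [], "<s>"
    for _ in range(max_len):
        dist = bigram[current]
        # Draw the next word from the conditional distribution.
        current = rng.choices(list(dist), weights=list(dist.values()))[0]
        if current == "</s>":   # end-of-sentence token: stop
            break
        words.append(current)
    return " ".join(words)

print(generate())
```

Replacing the bigram table with a neural network that scores every word in a large vocabulary given the full context yields exactly the generation loop used for GPT-2.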

We use the open-sourced GPT-2 medium model, which contains 330 million parameters. For Alexa and Sophia, we feed the GPT-2 model with hand-crafted sentence beginnings. For example, we create the sentence beginning “Alexa can help human” and use the GPT-2 model to generate a sentence automatically from it. Since the neural language model is a probabilistic generative model, we can sample different outputs in different rounds. For each sentence beginning for Alexa and Sophia, we randomly sample 512 sentences, following the hyperparameters suggested in [8]: we set the temperature to 1.0, set the top-k number to 40 to balance accuracy and diversity, and set the maximum sentence length to 128. We create 81 different sentence beginnings for Alexa and Sophia and finally obtain 80,000 sentences with 1,000,000 words in total. We randomly organize the sentences from Alexa and Sophia into conversations.
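The sampling rule described above (top-k filtering plus a temperature) can be sketched in a few lines. This is an illustration of the technique, not the project’s actual code, and the candidate words and scores below are invented:

```python
import math
import random

def sample_top_k(logits, k=40, temperature=1.0, rng=random):
    """Top-k sampling: keep the k highest-scoring words, rescale by
    temperature, renormalize, and draw one word. `logits` maps
    candidate words to raw model scores."""
    # Keep only the k most likely candidates.
    top = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)[:k]
    words = [w for w, _ in top]
    # Temperature < 1 sharpens the distribution; > 1 flattens it.
    weights = [math.exp(score / temperature) for _, score in top]
    return rng.choices(words, weights=weights)[0]

# Hypothetical scores for the word after "Alexa can help ...":
logits = {"humans": 3.2, "you": 2.9, "people": 2.5,
          "robots": 0.4, "xylophones": -5.0}
word = sample_top_k(logits, k=3, temperature=1.0, rng=random.Random(0))
print(word)  # one of "humans", "you", "people"; the rest are cut off
```

With k = 40, as in the project, only the 40 most probable next words survive at each step; lowering the temperature below 1.0 would concentrate probability further on the top candidates.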

Given the generated English texts, we translate each sentence from English to Chinese using Google Translate. As far as we know, Google Translate uses the Transformer model trained on millions of bilingual sentence pairs. Generally speaking, given a sentence in English, the Transformer encoder first encodes the sentence into contexts, which are usually real-valued vectors. The Transformer decoder then decodes the encoded contexts using a stack of attention layers and generates the word sequence in Chinese. In the last step, we convert the texts into voices using APIs from iFLYTEK.
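At the core of those attention layers is scaled dot-product attention [3]: each output position mixes the encoded vectors of all input positions, weighted by a softmax over query-key similarities. A minimal NumPy sketch (the dimensions here are arbitrary, chosen only for illustration):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """The attention mechanism at the core of the Transformer [3]:
    each query attends to all keys; softmax weights mix the values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # query-key similarities
    # Numerically stable softmax over the key axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                       # weighted sum of values

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 query positions, dimension 8
K = rng.normal(size=(6, 8))   # 6 key/value positions
V = rng.normal(size=(6, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

A full Transformer stacks many such layers (with learned projections for Q, K, and V) in both the encoder and the decoder.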

[1]. Hinton, Geoffrey, Li Deng, Dong Yu, George Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior et al. "Deep neural networks for acoustic modeling in speech recognition." IEEE Signal Processing Magazine 29 (2012).

[2]. He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Deep residual learning for image recognition." CVPR 2016.

[3]. Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. "Attention is all you need." NIPS 2017.

[4]. AlphaGo, DeepMind, 2017.

[5]. AlphaStar: Mastering the Real-Time Strategy Game StarCraft II, DeepMind, 2019.

[6]. OpenAI Five, OpenAI, 2019.

[7]. Du, Simon S., Jason D. Lee, Haochuan Li, Liwei Wang, and Xiyu Zhai. "Gradient descent finds global minima of deep neural networks." ICML 2019.

[8]. Radford, Alec, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. "Language models are unsupervised multitask learners." OpenAI Blog 1, no. 8 (2019).

[9]. Lu, Yiping, Zhuohan Li, Di He, Zhiqing Sun, Bin Dong, Tao Qin, Liwei Wang, and Tie-Yan Liu. "Understanding and Improving Transformer from a Multi-Particle Dynamic System Point of View." arXiv preprint arXiv:1906.02762.

Exhibition Press & Programs 展览新闻和项目

Lying Sophia & Mocking Alexa

Lying Sophia & Mocking Alexa Touring to
YUZ Museum in Shanghai

Lying Sophia & Mocking Alexa Opens at 
Hyundai Motorstudios in Beijing


Three Thousand Years of Algorithmic Rituals: The Emergence of AI from the Computation of Space



Text Matteo Pasquinelli 文 马蒂欧·帕斯克奈利

图片来源:弗里茨·施塔尔,《希腊与吠陀几何学》,印度哲学期刊 27.1 (1999): 105-127.
Illustration from Frits Staal, "Greek and Vedic geometry" Journal of Indian Philosophy 27.1 (1999): 105-127.

1. Recomposing a Dismembered God

In a fascinating myth of cosmogenesis from the ancient Vedas, it is said that the god Prajapati was shattered into pieces by the act of creating the universe. After the birth of the world, the supreme god is found dismembered, undone. In the corresponding Agnicayana ritual, Hindu devotees symbolically recompose the fragmented body of the god by building a fire altar according to an elaborate geometric plan.2 The fire altar is laid down by aligning thousands of bricks of precise shape and size to create the profile of a falcon. Each brick is numbered and placed while reciting its dedicated mantra, following step-by-step instructions. Each layer of the altar is built on top of the previous one, conforming to the same area and shape; solving the logical riddle that is the key to the ritual, each layer must keep the same shape and area as the contiguous ones while using a different configuration of bricks. Finally, the falcon altar must face east, a prelude to the symbolic flight of the reconstructed god towards the rising sun—an example of divine reincarnation by geometric means.

The Agnicayana ritual is described in the Shulba Sutras, composed around 800 BCE in India to record a much older oral tradition. The Shulba Sutras teach the construction of altars of specific geometric forms to secure gifts from the gods: for instance, they suggest that “those who wish to destroy existing and future enemies should construct a fire-altar in the form of a rhombus.”3 The complex falcon shape of the Agnicayana evolved gradually from a schematic composition of only seven squares. In the Vedic tradition, it is said that the Rishi vital spirits created seven square-shaped Purusha (cosmic entities, or persons) that together composed a single body, and it was from this form that Prajapati emerged once again. While art historian Wilhelm Worringer argued in 1907 that primordial art was born in the abstract line found in cave graffiti, one may assume that the artistic gesture also emerged through the composing of segments and fractions, introducing forms and geometric techniques of growing complexity.4 In his studies of Vedic mathematics, Italian mathematician Paolo Zellini has discovered that the Agnicayana ritual was used to transmit techniques of geometric approximation and incremental growth—in other words, algorithmic techniques—comparable to the modern calculus of Leibniz and Newton.5 Agnicayana is among the most ancient documented rituals still practiced today in India, and a primordial example of algorithmic culture.
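Zellini’s point about incremental approximation can be illustrated with the Shulba Sutras’ well-known rule for the diagonal of a unit square, that is, √2 (an example from the same textual tradition, not quoted in this essay): increase the side by its third, by a quarter of that third, less the thirty-fourth part of that quarter.

```python
from fractions import Fraction

# The Shulba Sutra rule as a sum of corrective terms: each term refines
# the previous approximation, an incremental-growth procedure.
sulba_sqrt2 = (Fraction(1)
               + Fraction(1, 3)
               + Fraction(1, 3 * 4)
               - Fraction(1, 3 * 4 * 34))

print(sulba_sqrt2)          # 577/408
print(float(sulba_sqrt2))   # ~1.4142157, vs sqrt(2) ~ 1.4142136
```

The result, 577/408, agrees with √2 to about five decimal places, using nothing beyond addition of fractions.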

But how can we define a ritual as ancient as the Agnicayana as algorithmic? To many, it may appear an act of cultural appropriation to read ancient cultures through the paradigm of the latest technologies. Nevertheless, claiming that abstract techniques of knowledge and artificial metalanguages belong uniquely to the modern industrial West is not only historically inaccurate but also an act of implicit epistemic colonialism towards cultures of other places and other times.6 The French mathematician Jean-Luc Chabert has noted that “algorithms have been around since the beginning of time and existed well before a special word had been coined to describe them. Algorithms are simply a set of step by step instructions, to be carried out quite mechanically, so as to achieve some desired result.”7 Today some may see algorithms as a recent technological innovation implementing abstract mathematical principles. On the contrary, algorithms are among the most ancient and material practices, predating many human tools and all modern machines:

Algorithms are not confined to mathematics … The Babylonians used them for deciding points of law, Latin teachers used them to get the grammar right, and they have been used in all cultures for predicting the future, for deciding medical treatment, or for preparing food … We therefore speak of recipes, rules, techniques, processes, procedures, methods, etc., using the same word to apply to different situations. The Chinese, for example, use the word shu (meaning rule, process or stratagem) both for mathematics and in martial arts … In the end, the term algorithm has come to mean any process of systematic calculation, that is a process that could be carried out automatically. Today, principally because of the influence of computing, the idea of finiteness has entered into the meaning of algorithm as an essential element, distinguishing it from vaguer notions such as process, method or technique.8

Before the consolidation of mathematics and geometry, ancient civilizations were already big machines of social segmentation that marked human bodies and territories with abstractions that remained, and continue to remain, operative for millennia. Drawing also on the work of historian Lewis Mumford, Gilles Deleuze and Félix Guattari offered a list of such old techniques of abstraction and social segmentation: “tattooing, excising, incising, carving, scarifying, mutilating, encircling, and initiating.”9 Numbers were already components of the “primitive abstract machines” of social segmentation and territorialization that would make human culture emerge: the first recorded census, for instance, took place around 3800 BCE in Mesopotamia. Logical forms were made out of social ones: numbers materially emerged through labor and rituals, discipline and power, marking and repetition.

In the 1970s, the field of “ethnomathematics” began to foster a break from the Platonic loops of elite mathematics, revealing the historical subjects behind computation.10 The political question at the center of the current debate on computation and the politics of algorithms is ultimately very simple, as Diane Nelson has reminded us: Who counts?11 Who computes? Algorithms and machines do not compute for themselves; they always compute for someone else, for institutions and markets, for industries and armies.

Illustration from Frank Rosenblatt, Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms, (Cornell Aeronautical Laboratory, Buffalo NY, 1961).

2. What Is an Algorithm?

The term “algorithm” comes from the Latinization of the name of the Persian scholar al-Khwarizmi. His tract On the Calculation with Hindu Numerals, written in Baghdad in the ninth century, is responsible for introducing Hindu numerals to the West, along with the corresponding new techniques for calculating them, namely algorithms. In fact, the medieval Latin word “algorismus” referred to the procedures and shortcuts for carrying out the four fundamental mathematical operations—addition, subtraction, multiplication, and division—with Hindu numerals. Later, the term “algorithm” would metaphorically denote any step-by-step logical procedure and become the core of computing logic. In general, we can distinguish three stages in the history of the algorithm: in ancient times, the algorithm can be recognized in procedures and codified rituals to achieve a specific goal and transmit rules; in the Middle Ages, the algorithm was the name of a procedure to help mathematical operations; in modern times, the algorithm qua logical procedure becomes fully mechanized and automated by machines and then digital computers.

Looking at ancient practices such as the Agnicayana ritual and the Hindu rules for calculation, we can sketch a basic definition of “algorithm” that is compatible with modern computer science: (1) an algorithm is an abstract diagram that emerges from the repetition of a process, an organization of time, space, labor, and operations: it is not a rule that is invented from above but emerges from below; (2) an algorithm is the division of this process into finite steps in order to perform and control it efficiently; (3) an algorithm is a solution to a problem, an invention that bootstraps beyond the constraints of the situation: any algorithm is a trick; (4) most importantly, an algorithm is an economic process, as it must employ the least amount of resources in terms of space, time, and energy, adapting to the limits of the situation.

Today, amidst the expanding capacities of AI, there is a tendency to perceive algorithms as an application or imposition of abstract mathematical ideas upon concrete data. On the contrary, the genealogy of the algorithm shows that its form has emerged from material practices, from a mundane division of space, time, labor, and social relations. Ritual procedures, social routines, and the organization of space and time are the source of algorithms, and in this sense they existed even before the rise of complex cultural systems such as mythology, religion, and especially language. In terms of anthropogenesis, it could be said that algorithmic processes encoded into social practices and rituals were what made numbers and numerical technologies emerge, and not the other way around. Modern computation, just looking at its industrial genealogy in the workshops studied by both Charles Babbage and Karl Marx, evolved gradually from concrete towards increasingly abstract forms.

Illustration from Frank Rosenblatt, Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms, (Cornell Aeronautical Laboratory, Buffalo NY, 1961).

3. The Rise of Machine Learning as Computational Space

In 1957, at the Cornell Aeronautical Laboratory in Buffalo, New York, the cognitive scientist Frank Rosenblatt invented and constructed the Perceptron, the first operative artificial neural network—grandmother of all the matrices of machine learning, which at the time was a classified military secret.12 The first prototype of the Perceptron was an analogue computer composed of an input device of 20 × 20 photocells (called the “retina”) connected through wires to a layer of artificial neurons that resolved into one single output (a light bulb turning on or off, to signify 0 or 1). The “retina” of the Perceptron recorded simple shapes such as letters and triangles and passed electric signals to a multitude of neurons that would compute a result according to a threshold logic. The Perceptron was a sort of photo camera that could be taught to recognize a specific shape, i.e., to make a decision with a margin of error (making it an “intelligent” machine). The Perceptron was the first machine-learning algorithm, a basic “binary classifier” that could determine whether a pattern fell within a specific class or not (whether the input image was a triangle or not, a square or not, etc.). To achieve this, the Perceptron progressively adjusted the values of its nodes in order to resolve a large numerical input (a spatial matrix of four hundred numbers) into a simple binary output (0 or 1). The Perceptron gave the result 1 if the input image was recognized within a specific class (a triangle, for instance); otherwise it gave the result 0. Initially, a human operator was necessary to train the Perceptron to learn the correct answers (manually switching the output node to 0 or 1), hoping that the machine, on the basis of these supervised associations, would correctly recognize similar shapes in the future. The Perceptron was designed not to memorize a specific pattern but to learn how to recognize potentially any pattern.
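The error-correction rule Rosenblatt used can be sketched in a few lines. The toy “retina” below is 3 × 3 rather than 20 × 20, and the patterns (a letter “T” versus other marks) are invented for illustration; the update rule itself is the classic perceptron algorithm:

```python
def train_perceptron(samples, lr=1.0, max_epochs=100):
    """Rosenblatt's error-correction rule: whenever the thresholded
    output disagrees with the supervisor's 0/1 label, nudge the weights
    toward (or away from) the input pattern. Converges when the two
    classes are linearly separable."""
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, label in samples:
            out = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            if out != label:
                mistakes += 1
                for i, xi in enumerate(x):
                    w[i] += lr * (label - out) * xi
                b += lr * (label - out)
        if mistakes == 0:   # a full error-free pass: training is done
            break
    return w, b

def predict(w, b, x):
    # Threshold logic: the "neuron" fires iff the weighted sum exceeds 0.
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# A 3x3 "retina": class 1 is the letter T, class 0 is any other mark.
T = (1, 1, 1,
     0, 1, 0,
     0, 1, 0)
others = [
    (1, 0, 0, 1, 0, 0, 1, 1, 1),   # L
    (1, 0, 1, 0, 1, 0, 1, 0, 1),   # X
    (1, 1, 1, 1, 0, 1, 1, 1, 1),   # O
]
data = [(T, 1)] + [(x, 0) for x in others]
w, b = train_perceptron(data)
print([predict(w, b, x) for x, _ in data])  # [1, 0, 0, 0]
```

As in the original machine, the labels play the role of the human operator manually fixing the output, and the learned weights are what the Perceptron stored in its adjustable nodes.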

The matrix of 20 × 20 photoreceptors in the first Perceptron was the beginning of a silent revolution in computation (which would become a hegemonic paradigm in the early twenty-first century with the advent of “deep learning,” a machine-learning technique). Although inspired by biological neurons, from a strictly logical point of view the Perceptron marked not a biomorphic turn in computation but a topological one; it signified the rise of the paradigm of “computational space” or “self-computing space.” This turn introduced a second spatial dimension into a paradigm of computation that until then had only a linear dimension (see the Turing machine that reads and writes 0 and 1 along a linear memory tape). This topological turn, which is the core of what people perceive today as “AI,” can be described more modestly as the passage from a paradigm of passive information to one of active information. Rather than having a visual matrix processed by a top-down algorithm (like any image edited by a graphics software program today), in the Perceptron the pixels of the visual matrix are computed in a bottom-up fashion according to their spatial disposition. The spatial relations of the visual data shape the operation of the algorithm that computes them.

Because of its spatial logic, the branch of computer science originally dedicated to neural networks was called “computational geometry.” The paradigm of computational space or self-computing space shares common roots with the studies of the principles of self-organization that were at the center of post-WWII cybernetics, such as John von Neumann’s cellular automata (1948) and Konrad Zuse’s Rechnender Raum (1967).13 Von Neumann’s cellular automata are clusters of pixels, perceived as small cells on a grid, that change status and move according to their neighboring cells, composing geometric figures that resemble evolving forms of life. Cellular automata have been used to simulate evolution and to study complexity in biological systems, but they remain finite-state algorithms confined to a rather limited universe. Konrad Zuse (who built the first programmable computer in Berlin in 1938) attempted to extend the logic of cellular automata to physics and to the whole universe. His idea of “rechnender Raum,” or calculating space, is a universe that is composed of discrete units that behave according to the behavior of neighboring units. Alan Turing’s last essay, “The Chemical Basis of Morphogenesis” (published in 1952, two years before his death), also belongs to the tradition of self-computing structures.14 Turing considered molecules in biological systems as self-computing actors capable of explaining complex bottom-up structures, such as tentacle patterns in hydra, whorl arrangement in plants, gastrulation in embryos, dappling in animal skin, and phyllotaxis in flowers.15
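The neighbor-driven update that cellular automata share with Zuse’s calculating space fits in a few lines. This sketch uses a one-dimensional “elementary” automaton (a later, simpler formulation than von Neumann’s two-dimensional cells, used here only to make the local rule visible):

```python
def step(cells, rule=110):
    """One update of an elementary cellular automaton: each cell's next
    state depends only on itself and its two neighbors, read off from
    the binary expansion of the rule number (ring topology at edges)."""
    n = len(cells)
    out = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        neighborhood = (left << 2) | (center << 1) | right   # 0..7
        out.append((rule >> neighborhood) & 1)
    return out

# A single live cell; printing a few generations shows a figure
# growing from purely local interactions.
cells = [0] * 15
cells[7] = 1
for _ in range(6):
    print("".join(".#"[c] for c in cells))
    cells = step(cells)
```

No cell “sees” the whole grid, yet global patterns emerge, which is precisely the self-computing-space intuition the essay describes.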

Von Neumann’s cellular automata and Zuse’s computational space are intuitively easy to understand as spatial models, while Rosenblatt’s neural network displays a more complex topology that requires more attention. Indeed, neural networks employ an extremely complex combinatorial structure, which is probably what makes them the most efficient algorithms for machine learning. Neural networks are said to “solve any problem,” meaning they can approximate the function of any pattern according to the Universal Approximation theorem (given enough layers of neurons and computing resources). All systems of machine learning, including support-vector machines, Markov chains, Hopfield networks, Boltzmann machines, and convolutional neural networks, to name just a few, started as models of computational geometry. In this sense they are part of the ancient tradition of ars combinatoria.16

Image from Hans Meinhardt, The Algorithmic Beauty of Sea Shells (Springer Science & Business Media, 2009).

4. The Automation of Visual Labor

Even at the end of the twentieth century, no one would have ever thought to call a truck driver a “cognitive worker,” an intellectual. At the beginning of the twenty-first century, the use of machine learning in the development of self-driving vehicles has led to a new understanding of manual skills such as driving, revealing how the most valuable component of work, generally speaking, has never been merely manual, but also social and cognitive (as well as perceptual, an aspect of labor still waiting to be located somewhere between the manual and the cognitive). What kind of work do drivers perform? Which human task will AI come to record with its sensors, imitate with its statistical models, and replace with automation? The best way to answer this question is to look at what technology has successfully automated, as well as what it hasn’t.

The industrial project to automate driving has made clear (more so than a thousand books on political economy) that the labor of driving is a conscious activity following codified rules and spontaneous social conventions. However, if the skill of driving can be translated into an algorithm, it will be because driving has a logical and inferential structure. Driving is a logical activity just as labor is a logical activity more generally. This postulate helps to resolve the trite dispute about the separation between manual labor and intellectual labor.17 It is a political paradox that the corporate development of AI algorithms for automation has made it possible to recognize in labor a cognitive component that had long been neglected by critical theory. What is the relation between labor and logic? This becomes a crucial philosophical question for the age of AI.

A self-driving vehicle automates all the micro-decisions that a driver must make on a busy road. Its artificial neural networks learn, that is, imitate and copy, the human correlations between the visual perception of the road space and the mechanical actions of vehicle control (steering, accelerating, stopping), together with the ethical decisions taken in a matter of milliseconds when danger arises (for the safety of persons inside and outside the vehicle). It becomes clear that the job of driving requires high cognitive skills that cannot be left to improvisation and instinct, but also that quick decision-making and problem-solving are possible thanks to habits and training that are not completely conscious. Driving remains essentially also a social activity, which follows both codified rules (with legal constraints) and spontaneous ones, including a tacit “cultural code” that any driver must subscribe to. Driving in Mumbai—it has been said many times—is not the same as driving in Oslo.

Obviously, driving summons an intense labor of perception. Much labor, in fact, appears mostly perceptive in nature, through continuous acts of decision and cognition that take place in the blink of an eye.18 Cognition cannot be completely disentangled from a spatial logic, and often follows a spatial logic in its more abstract constructions. Both observations—that perception is logical and that cognition is spatial—are empirically proven without fanfare by autonomous-driving AI algorithms that construct models to statistically infer visual space (encoded as digital video of a 3-D road scenario). Moreover, the driver that AI replaces in self-driving cars and drones is not an individual driver but a collective worker, a social brain that navigates the city and the world.19 Just looking at the corporate project of self-driving vehicles, it is clear that AI is built on collective data that encode a collective production of space, time, labor, and social relations. AI imitates, replaces, and emerges from an organized division of social space (following first a material algorithm, not the application of mathematical formulas or analysis in the abstract).

Animation from Chris Urmson’s TED talk “How a Driverless Car Sees the Road.” Urmson is the former chief engineer for Google’s Self-Driving Car Project. Animation by ZMScience.

5. The Memory and Intelligence of Space

Paul Virilio, the French philosopher of speed or “dromology,” was also a theorist of space and topology, for he knew that technology accelerates the perception of space as much as it morphs the perception of time. Interestingly, the title of Virilio’s book The Vision Machine was inspired by Rosenblatt’s Perceptron. With the classical erudition of a twentieth-century thinker, Virilio drew a sharp line between ancient techniques of memorization based on spatialization, such as the Method of Loci, and modern computer memory as a spatial matrix:

Cicero and the ancient memory-theorists believed you could consolidate natural memory with the right training. They invented a topographical system, the Method of Loci, an imagery-mnemonics which consisted of selecting a sequence of places, locations, that could easily be ordered in time and space. For example, you might imagine wandering through the house, choosing as loci various tables, a chair seen through a doorway, a windowsill, a mark on a wall. Next, the material to be remembered is coded into discrete images and each of the images is inserted in the appropriate order into the various loci. To memorize a speech, you transform the main points into concrete images and mentally “place” each of the points in order at each successive locus. When it is time to deliver the speech, all you have to do is recall the parts of the house in order.

The transformation of space, of topological coordinates and geometric proportions, into a technique of memory should be considered equal to the more recent transformation of collective space into a source of machine intelligence. At the end of the book, Virilio reflects on the status of the image in the age of “vision machines” such as the Perceptron, sounding a warning about the impending age of artificial intelligence as the “industrialisation of vision”:

“Now objects perceive me,” the painter Paul Klee wrote in his Notebooks. This rather startling assertion has recently become objective fact, the truth. After all, aren’t they talking about producing a “vision machine” in the near future, a machine that would be capable not only of recognizing the contours of shapes, but also of completely interpreting the visual field … ? Aren’t they also talking about the new technology of visionics: the possibility of achieving sightless vision whereby the video camera would be controlled by a computer? … Such technology would be used in industrial production and stock control; in military robotics, too, perhaps.

Now that they are preparing the way for the automation of perception, for the innovation of artificial vision, delegating the analysis of objective reality to a machine, it might be appropriate to have another look at the nature of the virtual image … Today it is impossible to talk about the development of the audiovisual … without pointing to the new industrialization of vision, to the growth of a veritable market in synthetic perception and all the ethical questions this entails … Don’t forget that the whole idea behind the Perceptron would be to encourage the emergence of fifth-generation “expert systems,” in other words an artificial intelligence that could be further enriched only by acquiring organs of perception.20

Ioannis de Sacro Busco, Algorismus Domini, c. 1501. National Central Library of Rome. Photo: Public Domain/Internet Archive.

6. Conclusion

If we consider the ancient geometry of the Agnicayana ritual, the computational matrix of the first neural network Perceptron, and the complex navigational system of self-driving vehicles, perhaps these different spatial logics together can clarify the algorithm as an emergent form rather than a technological a priori. The Agnicayana ritual is an example of an emergent algorithm as it encodes the organization of a social and ritual space. The symbolic function of the ritual is the reconstruction of the god through mundane means; this practice of reconstruction also symbolizes the expression of the many within the One (or the “computation” of the One through the many). The social function of the ritual is to teach basic geometry skills and to construct solid buildings.21 The Agnicayana ritual is a form of algorithmic thinking that follows the logic of a primordial and straightforward computational geometry.

The Perceptron is also an emergent algorithm that encodes according to a division of space, specifically a spatial matrix of visual data. The Perceptron’s matrix of photoreceptors defines a closed field and processes an algorithm that computes data according to their spatial relation. Here too the algorithm appears as an emergent process—the codification and crystallization of a procedure, a pattern, after its repetition. All machine-learning algorithms are emergent processes, in which the repetition of similar patterns “teach” the machine and cause the pattern to emerge as a statistical distribution.22
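Rosenblatt's procedure can be condensed into a few lines. The following is a sketch of a perceptron as binary classifier, using the standard error-correction training rule; the toy AND-gate data stands in here for the 20×20 photoreceptor “retina” of the original machine, so the dataset and parameters are illustrative, not historical:

```python
def train_perceptron(samples, epochs=20, lr=0.1):
    """Supervised training: nudge weights and bias after each wrong
    answer, so the decision pattern emerges from repetition."""
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, target in samples:
            out = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = target - out  # -1, 0, or +1
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def classify(w, b, x):
    """Threshold logic: does the input pattern belong to the class?"""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Toy "retina": learn the AND pattern (1 only when both cells fire).
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(data)
assert [classify(w, b, x) for x, _ in data] == [0, 0, 0, 1]
```

Note how the weights are a flat spatial matrix over the input field: the algorithm's operation is shaped by the spatial relation of the data, which is the point the text makes about self-computing space.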

Self-driving vehicles are an example of complex emergent algorithms since they grow from a sophisticated construction of space, namely, the road environment as a social institution of traffic codes and spontaneous rules. The algorithms of self-driving vehicles, after registering these spontaneous rules and the traffic codes of a given locale, try to predict unexpected events that may happen on a busy road. In the case of self-driving vehicles, the corporate utopia of automation makes the human driver evaporate, expecting that the visual space of the road scenario alone will dictate how the map will be navigated.

The Agnicayana ritual, the Perceptron, and the AI systems of self-driving vehicles are all, in different ways, forms of self-computing space and emergent algorithms (and probably, all of them, forms of the invisibilization of labor).

The idea of computational space or self-computing space stresses, in particular, that the algorithms of machine learning and AI are emergent systems that are based on a mundane and material division of space, time, labor, and social relations. Machine learning emerges from grids that continue ancient abstractions and rituals concerned with marking territories and bodies, counting people and goods; in this way, machine learning essentially emerges from an extended division of social labor. Despite the way it is often framed and critiqued, artificial intelligence is not really “artificial” or “alien”: in the usual mystification process of ideology, it appears to be a deus ex machina that descends to the world like in ancient theater. But this hides the fact that it actually emerges from the intelligence of this world.

What people call “AI” is actually a long historical process of crystallizing collective behavior, personal data, and individual labor into privatized algorithms that are used for the automation of complex tasks: from driving to translation, from object recognition to music composition. Just as much as the machines of the industrial age grew out of experimentation, know-how, and the labor of skilled workers, engineers, and craftsmen, the statistical models of AI grow out of the data produced by collective intelligence. Which is to say that AI emerges as an enormous imitation engine of collective intelligence. What is the relation between artificial intelligence and human intelligence? It is the social division of labor.


Matteo Pasquinelli (PhD) is Professor in Media Philosophy at the University of Arts and Design, Karlsruhe, where he coordinates the research group KIM (Künstliche Intelligenz und Medienphilosophie / Artificial Intelligence and Media Philosophy). For Verso he is preparing a monograph on the genealogy of artificial intelligence as division of labor, which is titled The Eye of the Master: Capital as Computation and Cognition.
© 2019 e-flux and the author

1. 重新拼凑一位被肢解的神灵

在古《吠陀经》里,有一段关于宇宙发生论的令人着迷的描述:名为“波阇波提”(Prajapati)的神明被创世的行动肢解成了碎片。在世界诞生之后,人们发现了这位至上神明支离破碎的躯体。在对应的梵文仪式火坛祭(Agnicayana)中,印度教信徒会象征性地“重组”这位神灵的身体。他们根据一个详尽的几何图形,堆砌起一个熊熊燃烧的祭坛。这个祭坛是由上千块有着精准形状和尺寸的砖石铺砌而成的,最终形成一只鹰隼的轮廓。每块砖上都标记了序号,信众专注地把它们按照次序排列,同时根据明确的步骤,吟诵着咒文。祭坛的每一层都被盖筑在另一层之上,形成完全一致的形状,覆盖相同的面积。这种教仪的关键在于解决一个逻辑“谜语”:祭坛的每一层都需要和上一层的形状与面积保持一致,但是砖块的排列方法完全不同。此外,有着鹰隼图样的祭坛必须面向东方,这是这位重构出来的神明象征性地飞向日升的东方的序曲——整个故事,如同一个通过几何的方式实现神性轮回的案例。

上述的火坛祭在《绳法经》(Shulba Sutras)中有详尽的描述。该书大约成书于公元前800年,记录了更早的口述传统。《绳法经》教导人们如何根据特定的几何形状去建造神坛,以保证神的旨意得以传承:比方说,书中建议“那些意图摧毁当下和未来之敌人的人们,应当按照斜方形来构筑火坛。”3


1907年艺术史学家威廉·沃林格(Wilhelm Worringer)曾言,原始艺术发源于洞穴绘画里的抽象线条,或许我们也可以假设,许多艺术特征也来自于对碎片和片段的重组,以及这一过程中引入的形式和几何技法,直到人们有能力创造出更高的复杂性。4 在对吠陀数学的研究中,意大利数学家保罗·杰里尼(Paolo Zellini)发现火坛祭也用于传递数学技法,具体包括几何上的近似法和增量改变——换言之,这些都是“计算”技法,相当于莱布尼茨与牛顿所建立的当代微积分学。5 火坛祭或许是现今仍然在流传的,最早有迹可循的古代祭祀活动,也是“计算文化”在原始时期的一丝线索。


法国数学家让-吕克·夏伯特(Jean-Luc Chabert)曾说:“算法在时间之初便已经存在,并且在我们确定一个特定的词语来描述它们之前就存在。算法仅仅指的是一系列按照步骤进行的指令,可以机械地执行,以得到某种期望的结果。”7 今日,许多人或许认为算法是一种近代技术发明,它代表对抽象数学原则的运用。恰恰相反,算法或许是最为古老的一种实践,同时它也是“物理”的,远远出现在许多人类工具和当代机器诞生之前:



在代数与几何学最终成形之前,古代文明已经某种意义上是种巨型机器,它对社会进行精准分割,用一系列抽象过程对人的身体和领土进行标注,这些标注方法曾持续,或许也将继续持续运转上千年。德勒兹(Gilles Deleuze)和瓜塔利(Félix Guattari)在参考了历史学家刘易斯·芒福德(Lewis Mumford)的一些工作后,提出了一个此类对社会进行抽象化和分割的古代技巧的清单,其中包括:“纹身、切除、切割、雕刻、划、切断、环绕和发起”。9 数字本身也是关于社会分割和地区分配的“早期抽象机器”的组成部分,而正是这些机器促生了人类文明:史料记载的最早的人口调查发生于公元前3800年的美索不达米亚地区。逻辑形式脱胎于社会形式,数字的概念通过劳动和仪式、纪律和权力、标注和重复等一系列社会过程,成为物理现实。

在20世纪70年代,国际数学界兴起了关于“民族数学”的研究,它打破了精英数学的柏拉图式循环,并开始展现计算概念背后的一系列历史课题。10 今日被热议的关于计算和算法的政治学,或许本质上非常简单,正如戴安·尼尔森(Diane Nelson)曾提醒我们的:“谁是数数的人?”11 “谁是进行计算的人?”



2. 何为算法?

“算法”这个术语本身来源于波斯学者花拉子密(al-Khwarizmi)的拉丁译名。他于9世纪在巴格达写就《关于印度数字计算》(On the Calculation with Hindu Numerals),被视为最早将印度数字概念,以及与之相随的新的计算技巧(算法)引入西方的著作。事实上,中世纪拉丁词语“algorismus”指的正是使用印度数字进行四则运算的过程。后来,“算法”这一术语在比喻意义上表示任何按照步骤进行的逻辑过程,并且成为了计算机逻辑的内核。广义上讲,我们可以将算法的历史分为三个阶段:在古代,“算法”可以被认为是根据过程或编码方式执行的仪式,以达到非常具体的目的,并将这套规则传承下去;在中世纪,算法指的是帮助数学操作的一种过程;在当代,算法和算法逻辑过程实现了整体性的机械化,由机器和数字计算机自动化地执行。




3. “机器学习作为计算空间”的兴起

1957年,纽约州水牛城的康奈尔航空实验室里,认知科学家弗兰克·罗森布拉特(Frank Rosenblatt)发明和建构了“感知机”(Perceptron),这是已知的首个可运行的人工神经网络——可谓几乎所有机器学习模型的祖母。在发明之际,它被列为军事机密。12 感知机的第一个原型是一台模拟信号的电脑:一台包含20×20个光电池(名为“视网膜”)的输入设备通过电线连接到一层人造神经元,计算输出唯一的结果(一个通过明灭象征0或1的灯泡)。感知机的“视网膜”记录下简单的形状,比如字母或者三角形,再把电信号传到一簇“神经元”,后者根据阈值逻辑计算出一个特定的结果。感知机有点像某种可以被训练识别特定形状的照相机:比如说,在有一定错误边际的基础上,做出决定(这使得它成为了一台“智能”机器)。感知机是最早的机器学习算法,一个基本的“二元分类器”,能够决定一个图形是否属于特定的判定类别(亦即,输入图像是不是一个三角形,是不是一个方形,如是等等)。为了实现这种能力,感知机需要不断调整节点的值,以分解一个大的数字输入(即一个由400个数字组成的空间矩阵),把它变成一个简单的二进制输出(0或1)。如果输入特征符合一个特定的分类(比如三角形),那么感知机输出1,否则输出0。在一开始的时候,感知机需要一个人类操作者来训练,以学会正确的答案(“训练”指的是人工地把输出节点调成0或1),这一行动的意图是让这个机器通过监督训练,能在未来有能力辨识出类似的形状。感知机不是被设计来记忆某一个特定形状的,而是用来学习识别任何潜在可能的形状。

第一台感知机的20×20光电池矩阵,是一次悄无声息的计算革命的源头(到了二十一世纪初期,随着“深度学习”这一机器学习类型的疾速发展,这场革命成为了一种主导范式)。尽管感知机的“神经元”是受到生物神经细胞的启发,但从严格逻辑上讲,感知机并不是对“计算”概念向生物拟仿的转变,而是对拓扑的拟仿;它预示了一种“计算空间”(computational space)或者“自计算空间”(self-computing space)范式的崛起。这一转向给“计算”的范式引入了一种空间维度——直到那时为止,计算都是线性的(比如图灵机是在一根线型存储带上写入0或1)。这种拓扑的转向,正是今天人们所认为之“人工智能”的内核,它可被更谨慎地描述为从被动信息到主动信息范式之间的跃迁。感知机并非运用一个自上而下的算法处理一个视觉矩阵(就像任何图形处理软件的编辑原理一样),而是自下而上地把视觉矩阵的每一个像素,根据它原本的空间位置,进行计算。所有这些视觉数据的空间关系塑造了计算它们的算法的操作形式。

正是因为这种空间逻辑,这类最早专注于神经网络的计算机科学分支在当时被称为“计算几何学”。计算空间或自计算空间的范式,和二战后盛行的控制论中的“自组织”原则有相似性,比如冯·诺伊曼(Von Neumann)的细胞自动机(cellular automata,1948)和康拉德·楚泽(Konrad Zuse)的“计算空间”(Rechnender Raum,1967)。13 细胞自动机被用来模拟自然演化,并研究复杂的生物系统,但它们仍然是在有限空间里的有限计算。康拉德·楚泽(他于1938年在柏林建造了第一台可编程计算机)尝试把细胞自动机的逻辑延展到物理学乃至整个宇宙。他提出“计算空间”(Rechnender Raum)的概念,这是一个由独立单元组成的宇宙,每一个独立单元的行为都由它周围的单元决定。阿兰·图灵的最后一篇论文《形态发生的化学基础》(The Chemical Basis of Morphogenesis,出版于1952年,他去世前两年)也研究了自计算结构。14 图灵认为生物系统里的分子是自计算的行动者,它们可以用来解释极为复杂的自下而上的结构,比如水螅触手的纹样、植物的螺纹、胚胎的原肠胚形成、动物表皮的斑点,和花卉的叶序。15
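细胞自动机“每个单元的行为由它周围的单元决定”的逻辑,可以用一个极简的一维示例来示意。以下代码仅是一个说明性的草图:规则编号(这里取常被讨论的规则110)与初始构型均为假设性的选择,并非原文所述的任何具体系统:

```python
def step(cells, rule=110):
    """一维细胞自动机:每个细胞的下一状态,仅由它自身
    与左右邻居的当前状态决定(周期性边界)。"""
    n = len(cells)
    nxt = []
    for i in range(n):
        left, center, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (center << 1) | right  # 邻域编码为 0..7
        nxt.append((rule >> index) & 1)              # 查规则表得新状态
    return nxt

# 从单个活细胞出发,空间本身逐步“自我计算”出一幅图案
cells = [0] * 8 + [1] + [0] * 8
for _ in range(5):
    cells = step(cells)
```

这里没有任何自上而下的总体算法:图案完全从局部邻域规则的重复运用中涌现,这正是“自计算空间”一词所指的情形。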

冯·诺伊曼的细胞自动机和楚泽的“计算空间”,作为一种空间模型而言,非常直观。而罗森布拉特的神经网络则呈现出更为复杂的空间结构。诚然,神经网络运用了极其复杂的组合结构,这或许也是它们在机器学习中呈现出最显著效率的原因。神经网络据称可以“解决任何问题”,这意味着它们可以通过通用近似定理(Universal Approximation Theorem)去趋近任何一种规律的运行(只要它们拥有足够多层的神经元和运算资源)。所有的机器学习系统,包括支持向量机、马尔可夫链、霍普菲尔德网络、玻尔兹曼机和卷积神经网络等等,都是从计算几何学发源的。从这个意义上讲,它们都源自于组合术(ars combinatoria)的历史传统。16

图片来源:汉斯·梅因哈特(Hans Meinhardt),《海螺的算法之美》(The Algorithmic Beauty of Sea Shells)(Springer Science & Business Media, 2009)。

4. 视觉劳动的自动化

哪怕到了二十世纪末叶,也不会有任何人把卡车司机称作一个“认知工人”(cognitive worker)或知识分子。在二十一世纪初期,机器学习被运用在自动驾驶领域,这促生了一种对包括“驾驶”在内的人力劳动的新的理解,也揭示了另一个事实:人类工作中最有价值的组成部分从不是纯人力的,而是带有社会性和认知性(也包括感知性,这是一种仍有待在人力与认知之间更好定位的劳动要素)。司机执行什么样的工作?人工智能会用它的感应器记录什么样的人类任务,用它的统计学模型进行模仿,并进而用自动化去替代?或许回答这一问题的最好方式,是去观察技术迄今为止已经成功地“自动化”了什么,而哪些领域尚未实现同等级别的自动化。

自动驾驶作为一个产业项目清晰地说明了(或许比一千本政治经济学书籍还清楚)一点:驾驶劳动是一个有意识的行动,它遵循一系列编撰的规则和本能反应的社会传统。然而,驾驶这一技能可以被翻译成一种算法,这是因为驾驶行为有一套逻辑和推理结构。驾驶是一种逻辑行为,正如劳动广义上也是一种逻辑行为。这一假设也有助于重新审视关于“体力劳动”和“智力劳动”之间那陈腐的划分方式的争议。17 许多企业在人工智能自动化算法方向的发展,也使得我们有能力把劳动视为一种认知元素,这在很长时期内是被批判性理论所忽略的,当然也成为了某种政治悖论。“劳动”和“逻辑”之间的关系是什么?这成为了人工智能时代最关键的哲学问题之一。


很显然,驾驶行为需要一种高度集中的感知劳动。事实上,自然界中的许多劳动都是“感知性”的,需要通过持续的、转瞬之间的决策行为和认知行为来实现。18 认知不能完全从空间逻辑本身中剥离开来,在认知的抽象建构中,它也遵循某种空间逻辑。“感知是有逻辑的”和“认知是有空间性的”这两种观察,都得到了一定的经验性证明,这不是单纯地来自自动驾驶算法的自我宣传。这些算法会构建能在统计学上推导视觉空间的模型(通常会被编码成一个有三维路面场景的数字影像)。除此之外,自动驾驶车里面人工智能系统所提到的那个“司机”,并不是一个个体,而是一个集体工人,一个“社会脑”,在城市和世界里巡航。19 如果我们观察那些自动驾驶项目,会发现,人工智能是借助集体数据的,这些数据编码了一种对于空间、时间、劳动和社会关系的整体生产。人工智能所模仿、替代和萌生的,是一种社会空间的组织化分区(它首先是对物质材料的运算,而不是发生在抽象世界的数学方程或分析)。


5. 空间的智能和记忆

提出“速度学”(dromology)概念的法国哲学家保罗·维希留(Paul Virilio)也进行关于空间和拓扑理论的研究,因为他知道科技加速了人类对于空间的感知,正如它扭曲了对时间的认知。非常有意思的是,维希留的书《视觉机器》(The Vision Machine),其标题正是受到了罗森布拉特感知机的启发。维希留是一位博闻强识的、古典的二十世纪思想家,他建立了古代基于空间概念的记忆方法(比如轨迹法)和近现代计算机的空间矩阵记忆方法之间的清晰线索。

西塞罗和其他的古代“记忆理论家”相信,人类可以通过正确的训练方式加强自然记忆能力。他们发明了一套基于拓扑学的系统,亦即“轨迹法”(Method of Loci),它指的是一种想象图景式的记忆术,涉及对一系列地点和位置的选择,并对其进行时空排布。举例来说,在这种记忆方法的场景里,你或许会想象在一个屋子里自由行走,选择不同的桌子、透过门廊看见的一把椅子、一个窗台,以及墙上的一个记号。接下来,需要被记忆的素材会被编码进独立的图像,而这些图像以特定的顺序,被安插在不同轨迹里。如果你需要记住一段演讲,你需要把关键点提炼出来,转译成图形,并在思想中把这些关键点“放置”在连续的轨迹里。当你真正需要发表演讲时,你只需要按照顺序,回忆起你放置它们的这个房间即可。

把空间、拓扑坐标和几何比例转译成一种记忆方法,和今天我们把集体空间转译成机器智能的来源,有异曲同工之妙。维希留在书的结尾处回顾了在包括感知机在内的“视觉机器”时代图像所处的地位,他也提出了某种警示:正在逼近的人工智能时代是“视觉的工业化”。

“现在,物在感知我”,画家保罗·克利(Paul Klee)曾在他的手稿中写下这么一句话。这个颇为惊人的陈述近来似乎成为了一个客观事实,某种真相。毕竟,难道人们不正是在讨论在近未来制造出某种“视觉机器”,它不仅能识别轮廓和形状,也能完整地解释整个视觉领域?难道人们不正是在讨论这样一种“视觉学”(visionics)新技术:让电脑控制摄像头,以实现无需肉眼的“视觉”(sightless vision)?此类技术可以被运用在工业生产、库存管理,或许还有军用机器人等领域。


约翰内斯·德·萨克罗博斯科(Ioannis de Sacro Busco),《Algorismus Domini》,约1501年。罗马国立中央图书馆。图片:公共领域/Internet Archive。

6. 结语

当我们回溯火坛祭里的古代几何学、最早的神经网络感知机的计算矩阵,和自动驾驶工具复杂的导航系统,或许这些不同的空间逻辑能共同厘清算法作为一种涌现的形式,而非一种技术上的先验。火坛祭是“涌现”算法的一个例子,在于它对社会和宗教仪式空间的组织方式进行了编码。这种仪式的象征性功能,是通过寻常的方式重构神明;这种重构实践也象征着在“一”里对“多”的表述(或者通过“多”,进行对“一”的计算)。宗教仪式的社会功能之一也是教给实践者基础的几何技能,以搭建坚固的房屋。21 火坛祭也是一种算法思考的形式,它遵循某种原始而直观的计算几何学逻辑。

感知机也是一种涌现算法,它通过对空间的分割,尤其是对视觉数据的空间矩阵排列,进行编码。感知机的光感受器矩阵定义了一个闭合域,并且运行了一种能根据数据的空间关系对其进行运算的算法。在这里,算法也呈现为一种涌现过程——某一进程或规律经过不断的重复被整理和清晰化。所有的机器学习算法都是涌现过程,在过程中,类似规律的反复出现将“教会”机器,规律也成为一种统计学分布。22

自动驾驶车是此类复杂涌现算法的案例,它发源于一种对空间的复杂建构,亦即把道路环境视为交通规则和自发规则的社会建制。这些自动驾驶算法把特定地点的自发规则和交通规则记录下来后,试图预测在一个繁忙的街道上可能会发生的事情。在自动驾驶的语境里,算法公司对于“自动化乌托邦”的想象是不再需要人类司机,道路场景的视觉空间本身会决定地图如何导航。

火坛祭、感知机和自动驾驶的人工智能系统,在不同意义上都是自计算空间和涌现算法的形式(或许也都属于劳动“不可见化”的形式)。

计算空间或者自计算空间的概念尤其强调:机器学习算法和人工智能都属于涌现系统,基于某种寻常的、对空间、时间、劳动和社会关系的物质性区分。机器学习是从古代对边界和身体进行标注、对人和货物进行计数等抽象方法和仪式所构成的网格中涌现出来的;从这个层面上讲,机器学习本质上是从社会分工的延伸中涌现出来的。尽管它通常被如此框限和批判,人工智能并非真正“人工”或“异质”的:在常见的意识形态神秘化过程中,它呈现为一种像古代剧场里的“天降之神”(deus ex machina)的状态。但这种论述其实掩盖了一个现实:人工智能事实上是从这个世界的智能中涌现出来的。


