EXTENDED READING

Anatomy of an AI System



The Amazon Echo as an anatomical map of human labor, data and planetary resources


︎

Text Kate Crawford & Vladan Joler








A cylinder sits in a room. It is impassive, smooth, simple and small. It stands 14.8cm high, with a single blue-green circular light that traces around its upper rim. It is silently attending. A woman walks into the room, carrying a sleeping child in her arms, and she addresses the cylinder.

‘Alexa, turn on the hall lights’

The cylinder springs into life. ‘OK.’ The room lights up. The woman makes a faint nodding gesture, and carries the child upstairs.

This is an interaction with Amazon’s Echo device. 3 A brief command and a response is the most common form of engagement with this consumer voice-enabled AI device. But in this fleeting moment of interaction, a vast matrix of capacities is invoked: interlaced chains of resource extraction, human labor and algorithmic processing across networks of mining, logistics, distribution, prediction and optimization. The scale of this system is almost beyond human imagining. How can we begin to see it, to grasp its immensity and complexity as a connected form? We start with an outline: an exploded view of a planetary system across three stages of birth, life and death, accompanied by an essay in 21 parts. Together, this becomes an anatomical map of a single AI system.

II


The scene of the woman talking to Alexa is drawn from a 2017 promotional video advertising the latest version of the Amazon Echo. The video begins, “Say hello to the all-new Echo” and explains that the Echo will connect to Alexa (the artificial intelligence agent) in order to “play music, call friends and family, control smart home devices, and more.” The device contains seven directional microphones, so the user can be heard at all times even when music is playing. The device comes in several styles, such as gunmetal grey or a basic beige, designed to either “blend in or stand out.” But even the shiny design options maintain a kind of blankness: nothing will alert the owner to the vast network that subtends and drives its interactive capacities. The promotional video simply states that the range of things you can ask Alexa to do is always expanding. “Because Alexa is in the cloud, she is always getting smarter and adding new features.”

How does this happen? Alexa is a disembodied voice that represents the human-AI interaction interface for an extraordinarily complex set of information processing layers. These layers are fed by constant tides: the flows of human voices being translated into text questions, which are used to query databases of potential answers, and the corresponding ebb of Alexa’s replies. For each response that Alexa gives, its effectiveness is inferred by what happens next:

Is the same question uttered again? (Did the user feel heard?)
Was the question reworded? (Did the user feel the question was understood?)
Was there an action following the question? (Did the interaction result in a tracked response: a light turned on, a product purchased, a track played?)

With each interaction, Alexa is training to hear better, to interpret more precisely, to trigger actions that map to the user’s commands more accurately, and to build a more complete model of their preferences, habits and desires. What is required to make this possible? Put simply: each small moment of convenience – be it answering a question, turning on a light, or playing a song – requires a vast planetary network, fueled by the extraction of non-renewable materials, labor, and data. The scale of resources required is many orders of magnitude greater than the energy and labor it would take a human to operate a household appliance or flick a switch. A full accounting for these costs is almost impossible, but it is increasingly important that we grasp the scale and scope if we are to understand and govern the technical infrastructures that thread through our lives.

III


The Salar, the world's largest flat surface, is located in southwest Bolivia at an altitude of 3,656 meters above sea level. It is a high plateau, covered by a few meters of salt crust which is exceptionally rich in lithium, containing 50% to 70% of the world's lithium reserves. 4 The Salar and the neighboring Atacama regions in Chile and Argentina are major sites for lithium extraction. This soft, silvery metal is currently used to power mobile connected devices, as a crucial material in the production of lithium-ion batteries. It is known as ‘grey gold.’ Smartphone batteries, for example, usually have less than eight grams of this material. 5 Each Tesla car needs approximately seven kilograms of lithium for its battery pack. 6 All these batteries have a limited lifespan, and once consumed they are thrown away as waste. Amazon reminds users that they cannot open up and repair their Echo, because this will void the warranty. The Amazon Echo is wall-powered, and also has a mobile battery base. This base also has a limited lifespan and then must be thrown away as waste.

According to the Aymara legends about the creation of Bolivia, the volcanic mountains of the Andean plateau were creations of tragedy. 7 Long ago, when the volcanos were alive and roaming the plains freely, Tunupa – the only female volcano – gave birth to a baby. Stricken by jealousy, the male volcanos stole her baby and banished it to a distant location. The gods punished the volcanos by pinning them all to the Earth. Grieving for the child that she could no longer reach, Tunupa wept deeply. Her tears and breast milk combined to create a giant salt lake: Salar de Uyuni. As Liam Young and Kate Davies observe, “your smart-phone runs on the tears and breast milk of a volcano. This landscape is connected to everywhere on the planet via the phones in our pockets; linked to each of us by invisible threads of commerce, science, politics and power.” 8

IV


Our exploded view diagram combines and visualizes three central, extractive processes that are required to run a large-scale artificial intelligence system: material resources, human labor, and data. We consider these three elements across time – represented as a visual description of the birth, life and death of a single Amazon Echo unit. It’s necessary to move beyond a simple analysis of the relationship between an individual human, their data, and any single technology company in order to contend with the truly planetary scale of extraction. Vincent Mosco has shown how the ethereal metaphor of ‘the cloud’ for offsite data management and processing is in complete contradiction with the physical realities of the extraction of minerals from the Earth’s crust and dispossession of human populations that sustain its existence. 9 Sandro Mezzadra and Brett Neilson use the term ‘extractivism’ to name the relationship between different forms of extractive operations in contemporary capitalism, which we see repeated in the context of the AI industry. 10 There are deep interconnections between the literal hollowing out of the materials of the earth and biosphere, and the data capture and monetization of human practices of communication and sociality in AI. Mezzadra and Neilson note that labor is central to this extractive relationship, which has repeated throughout history: from the way European imperialism used slave labor, to the forced work crews on rubber plantations in Malaya, to the Indigenous people of Bolivia being driven to extract the silver that was used in the first global currency. Thinking about extraction requires thinking about labor, resources, and data together. This presents a challenge to critical and popular understandings of artificial intelligence: it is hard to ‘see’ any of these processes individually, let alone collectively. Hence the need for a visualization that can bring these connected, but globally dispersed processes into a single map.

V


If you read our map from left to right, the story begins and ends with the Earth, and the geological processes of deep time. But read from top to bottom, we see the story as it begins and ends with a human. The top is the human agent, querying the Echo, and supplying Amazon with the valuable training data of verbal questions and responses that they can use to further refine their voice-enabled AI systems. At the bottom of the map is another kind of human resource: the history of human knowledge and capacity, which is also used to train and optimize artificial intelligence systems. This is a key difference between artificial intelligence systems and other forms of consumer technology: they rely on the ingestion, analysis and optimization of vast amounts of human-generated images, texts and videos.

VI


When a human engages with an Echo, or another voice-enabled AI device, they are acting as much more than just an end-product consumer. It is difficult to place the human user of an AI system into a single category: rather, they deserve to be considered as a hybrid case. Just as the Greek chimera was a mythological animal that was part lion, goat, snake and monster, the Echo user is simultaneously a consumer, a resource, a worker, and a product. This multiple identity recurs for human users in many technological systems. In the specific case of the Amazon Echo, the user has purchased a consumer device for which they receive a set of convenient affordances. But they are also a resource, as their voice commands are collected, analyzed and retained for the purposes of building an ever-larger corpus of human voices and instructions. And they provide labor, as they continually perform the valuable service of contributing feedback mechanisms regarding the accuracy, usefulness, and overall quality of Alexa’s replies. They are, in essence, helping to train the neural networks within Amazon’s infrastructural stack.

VII


Anything beyond the limited physical and digital interfaces of the device itself is outside of the user’s control. It presents a sleek surface with no ability to open it, repair it or change how it functions. The object itself is a very simple extrusion of plastic representing a collection of sensors – its real power and complexity lie somewhere else, far out of sight. The Echo is but an ‘ear’ in the home: a disembodied listening agent that never shows its deep connections to remote systems.

In 1673, the Jesuit polymath Athanasius Kircher invented the statua citofonica – the ‘talking statue.’ Kircher was an extraordinary interdisciplinary scholar and inventor. In his lifetime he published forty major works across the fields of medicine, geology, comparative religion and music. He invented the first magnetic clock, many early automatons, and the megaphone. His talking statue was a very early listening system: essentially a microphone made from a huge spiral tube, which could carry conversations from a public square up through the tube, to be piped out through the mouth of a statue kept within an aristocrat’s private chambers. As Kircher wrote:

“This statue must be located in a given place, in order to allow the end section of the spiral-shaped tube to precisely correspond to the opening of the mouth. In this manner it will be perfect, and capable to emit clearly any kind of sound: in fact the statue will be able to speak continuously, uttering in either a human or animal voice: it will laugh or sneer; it will seem to really cry or moan; sometimes with great astonishment it will strongly blow. If the opening of the spiral shaped tube is located in correspondence to an open public space, all human words pronounced, focused in the conduit, would be replayed through the mouth of the statue.” 11

The listening system could eavesdrop on everyday conversations in the piazza, and relay them to the 17th century Italian oligarchs. Kircher’s talking statue was an early form of information extraction for the elites – people talking in the street would have no indication that their conversations were being funneled to those who would instrument that knowledge for their own power, entertainment and wealth. People inside the homes of aristocrats would have no idea how a magical statue was speaking and conveying all manner of information. The aim was to obscure how the system worked: an elegant statue was all they could see. Listening systems, even at this early stage, were about power, class, and secrecy. But the infrastructure for Kircher’s system was prohibitively expensive – available only to the very few. And so the question remains, what are the full resource implications of building such systems? This brings us to the materiality of the infrastructure that lies beneath.

VIII


In his book A Geology of Media, Jussi Parikka suggests that we try to think of media not from Marshall McLuhan’s point of view – in which media are extensions of human senses 12 – but rather as an extension of Earth. 13 Media technologies should be understood in the context of a geological process, from the creation and the transformation processes, to the movement of natural elements from which media are built. Reflecting upon media and technology as geological processes enables us to consider the profound depletion of non-renewable resources required to drive the technologies of the present moment. Each object in the extended network of an AI system, from network routers to batteries to microphones, is built using elements that required billions of years to be produced. Looking from the perspective of deep time, we are extracting Earth’s history to serve a split second of technological time, in order to build devices that are often designed to be used for no more than a few years. For example, the Consumer Technology Association notes that the average smartphone lifespan is 4.7 years. 14 This obsolescence cycle fuels the purchase of more devices, drives up profits, and increases incentives for the use of unsustainable extraction practices. From a slow process of elemental development, these elements and materials go through an extraordinarily rapid period of excavation, smelting, mixing, and logistical transport – crossing thousands of kilometers in their transformation. Geological processes mark both the beginning and the end of this period, from the mining of ore, to the deposition of material in an electronic waste dump. For that reason, our map starts and ends with the Earth’s crust. However, all the transformations and movements we depict are only the barest anatomical outline: beneath these connections lie many more layers of fractal supply chains, and exploitation of human and natural resources, concentrations of corporate and geopolitical power, and continual energy consumption.

IX


Drawing out the connections between resources, labor and data extraction brings us inevitably back to traditional frameworks of exploitation. But how is value being generated through these systems? A useful conceptual tool can be found in the work of Christian Fuchs and other authors examining and defining digital labor. The notion of digital labor, which was initially linked with different forms of non-material labor, precedes the life of devices and complex systems such as artificial intelligence. Digital labor – the work of building and maintaining the stack of digital systems – is far from ephemeral or virtual, but is deeply embodied in different activities. 15 The scope is overwhelming: from indentured labor in mines for extracting the minerals that form the physical basis of information technologies; to the work of strictly controlled and sometimes dangerous hardware manufacturing and assembly processes in Chinese factories; to exploited outsourced cognitive workers in developing countries labelling AI training data sets; to the informal physical workers cleaning up toxic waste dumps. These processes create new accumulations of wealth and power, which are concentrated in a very thin social layer.

X


This triangle of value extraction and production represents one of the basic elements of our map, from birth in a geological process, through life as a consumer AI product, and ultimately to death in an electronics dump. As in Fuchs’ work, our triangles are not isolated, but linked to one another in the production process. They form a cyclic flow in which the product of work is transformed into a resource, which is transformed into a product, which is transformed into a resource and so on. Each triangle represents one phase in the production process. Although this appears on the map as a linear path of transformation, a different visual metaphor better represents the complexity of current extractivism: the fractal structure known as the Sierpinski triangle.

A linear display does not enable us to show that each next step of production and exploitation contains previous phases. If we look at the production and exploitation system through a fractal visual structure, the smallest triangle would represent natural resources and means of labor, i.e. the miner as labor and ore as product. The next larger triangle encompasses the processing of metals, and the next would represent the process of manufacturing components and so on. The ultimate triangle in our map, the production of the Amazon Echo unit itself, includes all of these levels of exploitation – from the bottom to the very top of Amazon Inc, a role inhabited by Jeff Bezos as CEO of Amazon. Like a pharaoh of ancient Egypt, he stands at the top of the largest pyramid of AI value extraction.

XI


To return to the basic element of this visualization – a variation of Marx’s triangle of production – each triangle creates a surplus of value for creating profits. If we look at the scale of average income for each activity in the production process of one device, which is shown on the left side of our map, we see the dramatic difference in income earned. According to research by Amnesty International, during the excavation of cobalt, which is also used for the lithium batteries of 16 multinational brands, workers are paid the equivalent of one US dollar per day for working in conditions hazardous to life and health, and are often subjected to violence, extortion and intimidation. 16 Amnesty has documented children as young as 7 working in the mines. In contrast, Amazon CEO Jeff Bezos, at the top of our fractal pyramid, made an average of $275 million a day during the first five months of 2018, according to the Bloomberg Billionaires Index. 17 A child working in a mine in the Congo would need more than 700,000 years of non-stop work to earn the same amount as a single day of Bezos’ income.
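The comparison in the last sentence can be checked with simple arithmetic. The sketch below uses the two figures quoted above (one US dollar per day and $275 million per day) and assumes a 365-day year with no days off, which is an assumption made here for illustration; the function and variable names are likewise illustrative.

```python
# Figures quoted in the text above.
MINER_WAGE_USD_PER_DAY = 1.0
BEZOS_INCOME_USD_PER_DAY = 275_000_000

# Assumption made for this sketch: work every day of a 365-day year.
DAYS_PER_YEAR = 365


def years_to_match(target_usd, wage_usd_per_day, days_per_year=DAYS_PER_YEAR):
    """Years of uninterrupted daily work needed to earn target_usd."""
    days_needed = target_usd / wage_usd_per_day
    return days_needed / days_per_year


years = years_to_match(BEZOS_INCOME_USD_PER_DAY, MINER_WAGE_USD_PER_DAY)
print(f"{years:,.0f} years of non-stop work")
```

Under these assumptions the result is roughly 750,000 years, consistent with the "more than 700,000 years" stated above.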

Many of the triangles shown on this map hide different stories of labor exploitation and inhumane working conditions. The ecological price of transformation of elements and income disparities is just one of the possible ways of representing a deep systemic inequality. We have both researched different forms of ‘black boxes’ understood as algorithmic processes, 18 but this map points to another form of opacity: the very process of creating, training and operating a device like an Amazon Echo is itself a kind of black box, very hard to examine and track in toto given the multiple layers of contractors, distributors, and downstream logistical partners around the world. As Mark Graham writes, “contemporary capitalism conceals the histories and geographies of most commodities from consumers. Consumers are usually only able to see commodities in the here and now of time and space, and rarely have any opportunities to gaze backwards through the chains of production in order to gain knowledge about the sites of production, transformation, and distribution.” 19

One illustration of the difficulty of investigating and tracking the contemporary production chain process is that it took Intel more than four years to understand its supply line well enough to ensure that no tantalum from the Congo was in its microprocessor products. As a semiconductor chip manufacturer, Intel supplies Apple with processors. In order to do so, Intel has its own multi-tiered supply chain of more than 19,000 suppliers in over 100 countries providing direct materials for their production processes, tools and machines for their factories, and logistics and packaging services. 20 That it took over four years for a leading technology company just to understand its own supply chain reveals just how hard this process can be to grasp from the inside, let alone for external researchers, journalists and academics. Dutch-based technology company Philips has also claimed that it was working to make its supply chain 'conflict-free'. Philips, for example, has tens of thousands of different suppliers, each of which provides different components for their manufacturing processes. 21 Those suppliers are themselves linked downstream to tens of thousands of component manufacturers that acquire materials from hundreds of refineries that buy ingredients from different smelters, which are supplied by unknown numbers of traders that deal directly with both legal and illegal mining operations. In The Elements of Power, David S. Abraham describes the invisible networks of rare metals traders in global electronics supply chains: “The network to get rare metals from the mine to your laptop travels through a murky network of traders, processors, and component manufacturers. Traders are the middlemen who do more than buy and sell rare metals: they help to regulate information and are the hidden link that helps in navigating the network between metals plants and the components in our laptops.” 22 According to the computer manufacturing company Dell, complexities of the metal supply chain pose almost insurmountable challenges. 23 The mining of these minerals takes place long before a final product is assembled, making it exceedingly difficult to trace the minerals' origin. In addition, many of the minerals are smelted together with recycled metals, by which point it becomes all but impossible to trace the minerals to their source. So we see that the attempt to capture the full supply chain is a truly gargantuan task: revealing all the complexity of the 21st century global production of technology products.

XII


Supply chains are often layered on top of one another, in a sprawling network. Apple’s supplier program reveals there are tens of thousands of individual components embedded in their devices, which are in turn supplied by hundreds of different companies. In order for each of those components to arrive on the final assembly line where it will be assembled by workers in Foxconn facilities, different components need to be physically transferred from more than 750 supplier sites across 30 different countries. 24 This becomes a complex structure of supply chains within supply chains, a zooming fractal of tens of thousands of suppliers, millions of kilometers of shipped materials and hundreds of thousands of workers included within the process even before the product is assembled on the line.

Visualizing this process as one global, pancontinental network through which materials, components and products flow, we see an analogy to the global information network. Where there is a single internet packet travelling to an Amazon Echo, here we can imagine a single cargo container. 25 The dizzying spectacle of global logistics and production would not be possible without the invention of this simple, standardized metal object. Standardized cargo containers allowed the explosion of the modern shipping industry, which made it possible to model the planet as a massive, single factory. In 2017, the capacity of container ships in seaborne trade reached nearly 250,000,000 dead-weight tons of cargo, dominated by giant shipping companies like Maersk of Denmark, the Mediterranean Shipping Company of Switzerland, and France’s CMA CGM Group, each owning hundreds of container vessels. 26 For these commercial ventures, cargo shipping is a relatively cheap way to traverse the vascular system of the global factory, yet it disguises much larger external costs.

In recent years, ships have produced 3.1% of global yearly CO2 emissions, more than the entire country of Germany. 27 In order to minimize their internal costs, most of the container shipping companies use very low grade fuel in enormous quantities, which leads to increased amounts of sulphur in the air, among other toxic substances. It has been estimated that one container ship can emit as much pollution as 50 million cars, and 60,000 deaths worldwide each year are attributed indirectly to pollution from the cargo shipping industry. 28 Even industry-friendly sources like the World Shipping Council admit that thousands of containers are lost each year, on the ocean floor or drifting loose. 29 Some carry toxic substances which leak into the oceans. Typically, workers spend 9 to 10 months at sea, often with long working shifts and without access to external communications. Workers from the Philippines represent more than a third of the global shipping workforce. 30 The most severe costs of global logistics are borne by the atmosphere, the oceanic ecosystem and all it contains, and the lowest paid workers.

XIII


The increasing complexity and miniaturization of our technology depends on a process that strangely echoes the hopes of early medieval alchemy. Where medieval alchemists aimed to transform base metals into ‘noble’ ones, researchers today use rare earth metals to enhance the performance of other minerals. There are 17 rare earth elements, which are embedded in laptops and smartphones, making them smaller and lighter. They play a role in color displays, loudspeakers, camera lenses, GPS systems, rechargeable batteries, hard drives and many other components. They are key elements in communication systems, from fiber optic cables and signal amplification in mobile communication towers to satellites and GPS technology. But the precise configuration and use of these minerals is hard to ascertain. In the same way that medieval alchemists hid their research behind cyphers and cryptic symbolism, contemporary processes for using minerals in devices are protected behind NDAs and trade secrets.

The unique electronic, optical and magnetic characteristics of rare earth elements cannot be matched by any other metals or synthetic substitutes discovered to date. While they are called ‘rare earth metals’, some are relatively abundant in the Earth’s crust, but extraction is costly and highly polluting. David Abraham describes the mining of dysprosium and terbium, used in a variety of high-tech devices, in Jiangxi, China. He writes, “Only 0.2 percent of the mined clay contains the valuable rare earth elements. This means that 99.8 percent of earth removed in rare earth mining is discarded as waste called ‘tailings’ that are dumped back into the hills and streams,” creating new pollutants like ammonium. 31 In order to refine one ton of rare earth elements, “the Chinese Society of Rare Earths estimates that the process produces 75,000 liters of acidic water and one ton of radioactive residue.” 32 Furthermore, mining and refining activities consume vast amounts of water and generate large quantities of CO2 emissions. In 2009, China produced 95% of the world's supply of these elements, and it has been estimated that the single mine known as Bayan Obo contains 70% of the world's reserves. 33

XIV


A satellite picture of the tiny Indonesian island of Bangka tells a story about the human and environmental toll of semiconductor production. On this tiny island, mostly ‘informal’ miners work on makeshift pontoons, using bamboo poles to scrape the seabed, and then diving underwater to suck tin from the surface through giant, vacuum-like tubes. As a Guardian investigation reports, “tin mining is a lucrative but destructive trade that has scarred the island's landscape, bulldozed its farms and forests, killed off its fish stocks and coral reefs, and dented tourism to its pretty palm-lined beaches. The damage is best seen from the air, as pockets of lush forest huddle amid huge swaths of barren orange earth. Where not dominated by mines, this is pockmarked with graves, many holding the bodies of miners who have died over the centuries digging for tin.” 34 Two small islands, Bangka and Belitung, produce 90% of Indonesia's tin, and Indonesia is the world's second-largest exporter of the metal. Indonesia's national tin corporation, PT Timah, supplies companies such as Samsung directly, as well as solder makers Chernan and Shenmao, which in turn supply Sony, LG and Foxconn. 35

XV


At Amazon distribution centers, vast collections of products are arrayed in a computational order across millions of shelves. The position of every item in this space is precisely determined by complex mathematical functions that process information about orders and create relationships between products. The aim is to optimize the movements of the robots and humans that collaborate in these warehouses. With the help of an electronic bracelet, the human worker is directed through warehouses the size of airplane hangars, filled with objects arranged in an opaque algorithmic order. 36

Hidden among the thousands of other publicly available patents owned by Amazon, U.S. patent number 9,280,157 represents an extraordinary illustration of worker alienation, a stark moment in the relationship between humans and machines. 37 It depicts a metal cage intended for the worker, equipped with different cybernetic add-ons, that can be moved through a warehouse by the same motorized system that shifts shelves filled with merchandise. Here, the worker becomes a part of a machinic ballet, held upright in a cage which dictates and constrains their movement.

As we have seen time and time again in the research for our map, dystopian futures are built upon the unevenly distributed dystopian regimes of the past and present, scattered through an array of production chains for modern technical devices. The vanishingly few at the top of the fractal pyramid of value extraction live in extraordinary wealth and comfort. But the majority of the pyramids are made from the dark tunnels of mines, radioactive waste lakes, discarded shipping containers, and corporate factory dormitories.

XVI


At the end of the 19th century, a particular Southeast Asian tree called palaquium gutta became the center of a technological boom. These trees, found mainly in Malaysia, produce a milky white natural latex called gutta percha. After English scientist Michael Faraday published a study in The Philosophical Magazine in 1848 about the use of this material as an electrical insulator, gutta percha rapidly became the darling of the engineering world. It was seen as the solution to the problem of insulating telegraphic cables so that they could withstand the conditions of the ocean floor. As the global submarine cable business grew, so did demand for palaquium gutta tree trunks. The historian John Tully describes how local Malay, Chinese and Dayak workers were paid little for the dangerous work of felling the trees and slowly collecting the latex. 38 The latex was processed and then sold through Singapore’s trade markets into the British market, where it was transformed into, among other things, lengths upon lengths of submarine cable sheaths.

A mature palaquium gutta could yield around 300 grams of latex. But in 1857, the first transatlantic cable was around 3000 km long and weighed 2000 tons – requiring around 250 tons of gutta percha. To produce the gutta percha for this single cable required around 900,000 tree trunks. The jungles of Malaysia and Singapore were stripped, and by the early 1880s the palaquium gutta had vanished. In a last-ditch effort to save their supply chain, the British passed a ban in 1883 to halt harvesting the latex, but the tree was already extinct. 39
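The scale of felling implied by these figures can be checked with simple arithmetic. The sketch below takes the yield of about 300 grams per tree and the roughly 250 tons of gutta percha from the text, and assumes metric tons (an assumption made here; the text does not say which ton is meant). The names are illustrative only.

```python
# Figures quoted in the text above.
LATEX_PER_TREE_GRAMS = 300
CABLE_GUTTA_PERCHA_TONS = 250

# Assumption made for this sketch: metric tons (1,000,000 grams each).
GRAMS_PER_TON = 1_000_000


def trees_needed(tons, grams_per_tree=LATEX_PER_TREE_GRAMS):
    """Approximate number of trees felled to yield `tons` of latex."""
    return tons * GRAMS_PER_TON / grams_per_tree


trees_for_cable = trees_needed(CABLE_GUTTA_PERCHA_TONS)
print(f"about {trees_for_cable:,.0f} trees for the 1857 cable")
```

Under these assumptions the single cable would have consumed a little over 800,000 trees, the same order as the figure of around 900,000 trunks given above.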

The Victorian environmental disaster of gutta percha, from the early origins of the global information society, shows how the relationships between technology and its materiality, environments, and different forms of exploitation are imbricated. Just as Victorians precipitated ecological disaster for their early cables, so do rare earth mining and global supply chains further imperil the delicate ecological balance of our era. From the material used to build the technology enabling contemporary networked society, to the energy needed for transmitting, analyzing, and storing the data flowing through the massive infrastructure, to the materiality of infrastructure: these deep connections and costs are more significant, and have a far longer history, than is usually represented in the corporate imaginaries of AI. 40

XVII


Large-scale AI systems consume enormous amounts of energy. Yet the material details of those costs remain vague in the social imagination. It remains difficult to get precise details about the amount of energy consumed by cloud computing services. A Greenpeace report states: “One of the single biggest obstacles to sector transparency is Amazon Web Services (AWS). The world's biggest cloud computer company remains almost completely non-transparent about the energy footprint of its massive operations. Among the global cloud providers, only AWS still refuses to make public basic details on the energy performance and environmental impact associated with its operations.” 41

As human agents, we are visible in almost every interaction with technological platforms. We are always being tracked, quantified, analyzed and commodified. But in contrast to user visibility, the precise details about the phases of birth, life and death of networked devices are obscured. With emerging devices like the Echo relying on a centralized AI infrastructure far from view, even more of the detail falls into the shadows.

While consumers become accustomed to a small hardware device in their living rooms, or a phone app, or a semi-autonomous car, the real work is being done within machine learning systems that are generally remote from the user and utterly invisible to her. In many cases, transparency wouldn’t help much – without forms of real choice, and corporate accountability, mere transparency won’t shift the weight of the current power asymmetries. 42

The outputs of machine learning systems are predominantly unaccountable and ungoverned, while the inputs are enigmatic. To the casual observer, it looks like it has never been easier to build AI or machine learning-based systems than it is today. The availability of open-source tools for doing so, in combination with rentable computation power through cloud superpowers such as Amazon (AWS), Microsoft (Azure), or Google (Google Cloud), is giving rise to a false idea of the ‘democratization’ of AI. While ‘off the shelf’ machine learning tools, like TensorFlow, are becoming more accessible from the point of view of setting up your own system, the underlying logics of those systems, and the datasets for training them, are accessible to and controlled by very few entities. In the dynamic of dataset collection through platforms like Facebook, users are feeding and training the neural networks with behavioral data, voice, tagged pictures and videos, or medical data. In an era of extractivism, the real value of that data is controlled and exploited by the very few at the top of the pyramid.

XVIII


When massive data sets are used to train AI systems, the individual images and videos involved are commonly tagged and labeled. 43 There is much to be said about how this labelling process abrogates and crystallizes meaning, and further, how this process is driven by clickworkers being paid fractions of a cent for this digital piecework.

In 1770, Hungarian inventor Wolfgang von Kempelen constructed a chess-playing machine known as the Mechanical Turk. His goal, in part, was to impress Empress Maria Theresa of Austria. This device was capable of playing chess against a human opponent and had spectacular success, winning most of the games played during its demonstrations around Europe and the Americas for almost nine decades. But the Mechanical Turk was an illusion that allowed a human chess master to hide inside the machine and operate it. Some 160 years later, Amazon.com branded its micropayment-based crowdsourcing platform with the same name. According to Ayhan Aytes, Amazon’s initial motivation to build Mechanical Turk emerged after the failure of its artificial intelligence programs in the task of finding duplicate product pages on its retail website. 44 After a series of futile and expensive attempts, the project engineers turned to humans to work behind computers within a streamlined web-based system. 45 The Amazon Mechanical Turk digital workshop emulates artificial intelligence systems by checking, assessing and correcting machine learning processes with human brainpower. With Amazon Mechanical Turk, it may seem to users that an application is using advanced artificial intelligence to accomplish tasks. But it is closer to a form of ‘artificial artificial intelligence’, driven by a remote, dispersed and poorly paid clickworker workforce that helps a client achieve their business objectives. As observed by Aytes, “in both cases [both the Mechanical Turk from 1770 and the contemporary version of Amazon’s service] the performance of the workers who animate the artifice is obscured by the spectacle of the machine.” 46

This kind of invisible labor, outsourced or crowdsourced, hidden behind interfaces and camouflaged within algorithmic processes, is now commonplace, particularly in the process of tagging and labeling thousands of hours of digital archives for the sake of feeding the neural networks. Sometimes this labor is entirely unpaid, as in the case of Google’s reCAPTCHA. In a paradox that many of us have experienced, in order to prove that you are not an artificial agent, you are forced to train Google’s image recognition AI system for free, by selecting multiple boxes that contain street numbers, or cars, or houses.

As we see repeated throughout the system, contemporary forms of artificial intelligence are not so artificial after all. We can speak of the hard physical labor of mine workers, of the repetitive factory labor on the assembly line, of the cybernetic labor in distribution centers and the cognitive sweatshops full of outsourced programmers around the world, of the low-paid crowdsourced labor of Mechanical Turk workers, or of the unpaid immaterial work of users. At every level, contemporary technology is deeply rooted in and running on the exploitation of human bodies.

XIX


In his one-paragraph short story "On Exactitude in Science", Jorge Luis Borges presents us with an imagined empire in which cartographic science became so developed and precise that it needed a map on the same scale as the empire itself. 47

“...In that Empire, the Art of Cartography attained such Perfection that the map of a single Province occupied the entirety of a City, and the map of the Empire, the entirety of a Province. In time, those Unconscionable Maps no longer satisfied, and the Cartographers Guilds struck a Map of the Empire whose size was that of the Empire, and which coincided point for point with it. The following Generations, who were not so fond of the Study of Cartography as their Forebears had been, saw that that vast map was Useless, and not without some Pitilessness was it, that they delivered it up to the Inclemencies of Sun and Winters. In the Deserts of the West, still today, there are Tattered Ruins of that Map, inhabited by Animals and Beggars; in all the Land there is no other Relic of the Disciplines of Geography.”

Current machine learning approaches are characterized by an aspiration to map the world, a full quantification of visual, auditory, and recognition regimes of reality. From cosmological models of the universe to the world of human emotions as interpreted through the tiniest muscle movements in the human face, everything becomes an object of quantification. Jean-François Lyotard introduced the phrase “affinity to infinity” to describe how contemporary art, techno-science and capitalism share the same aspiration to push boundaries towards a potentially infinite horizon. 48 The second half of the 19th century, with its focus on the construction of infrastructure and the uneven transition to industrialized society, generated enormous wealth for the small number of industrial magnates that monopolized exploitation of natural resources and production processes.

The new infinite horizon is data extraction, machine learning, and reorganizing information through artificial intelligence systems of combined human and machinic processing. The territories are dominated by a few global mega-companies, which are creating new infrastructures and mechanisms for the accumulation of capital and exploitation of human and planetary resources.

Such unrestrained thirst for new resources and fields of cognitive exploitation has driven a search for ever deeper layers of data that can be used to quantify the human psyche, conscious and unconscious, private and public, idiosyncratic and general. In this way, we have seen the emergence of multiple cognitive economies: the attention economy, 49 the surveillance economy, the reputation economy, 50 and the emotion economy, as well as the quantification and commodification of trust and evidence through cryptocurrencies.

Increasingly, the process of quantification is reaching into the human affective, cognitive, and physical worlds. Training sets exist for emotion detection, for family resemblance, for tracking an individual as they age, and for human actions like sitting down, waving, raising a glass, or crying. Every form of biodata – including forensic, biometric, sociometric, and psychometric – is being captured and logged into databases for AI training. That quantification often runs on very limited foundations: datasets like AVA, which primarily shows women in the ‘playing with children’ action category, and men in the ‘kicking a person’ category. The training sets for AI systems claim to be reaching into the fine-grained nature of everyday life, but they repeat the most stereotypical and restricted social patterns, re-inscribing a normative vision of the human past and projecting it into the human future.

XX


“The 'enclosure' of biodiversity and knowledge is the final step in a series of enclosures that began with the rise of colonialism. Land and forests were the first resources to be 'enclosed' and converted from commons to commodities. Later on, water resources were 'enclosed' through dams, groundwater mining and privatization schemes. Now it is the turn of biodiversity and knowledge to be 'enclosed' through intellectual property rights (IPRs),” Vandana Shiva explains. 51 In Shiva’s words, “the destruction of commons was essential for the industrial revolution, to provide a supply of natural resources for raw material to industry. A life-support system can be shared, it cannot be owned as private property or exploited for private profit. The commons, therefore, had to be privatized, and people's sustenance base in these commons had to be appropriated, to feed the engine of industrial progress and capital accumulation.” 52

While Shiva is referring to the enclosure of nature by intellectual property rights, the same process is now occurring with machine learning – an intensification of quantified nature. The new gold rush in the context of artificial intelligence is to enclose different fields of human knowing, feeling, and action, in order to capture and privatize those fields. When in November 2015 DeepMind Technologies Ltd. got access to the health records of 1.6 million identifiable patients of the Royal Free hospital, we witnessed a particular form of privatization: the extraction of knowledge value. 53 A dataset may still be publicly owned, but the meta-value of the data – the model created by it – is privately owned. While there are many good reasons to seek to improve public health, there is a real risk if it comes at the cost of a stealth privatization of public medical services. That is a future where expert local human labor in the public system is augmented and sometimes replaced with centralized, privately owned corporate AI systems that use public data to generate enormous wealth for the very few.

XXI


At this moment in the 21st century, we see a new form of extractivism that is well underway: one that reaches into the furthest corners of the biosphere and the deepest layers of human cognitive and affective being. Many of the assumptions about human life made by machine learning systems are narrow, normative and laden with error. Yet they are inscribing and building those assumptions into a new world, and will increasingly play a role in how opportunities, wealth, and knowledge are distributed.

The stack that is required to interact with an Amazon Echo goes well beyond the multi-layered ‘technical stack’ of data modeling, hardware, servers and networks. The full stack reaches much further into capital, labor and nature, and demands an enormous amount of each. The true costs of these systems – social, environmental, economic, and political – remain hidden and may stay that way for some time.

We offer up this map and essay as a way to begin seeing across a wider range of system extractions. The scale required to build artificial intelligence systems is too complex, too obscured by intellectual property law, and too mired in logistical complexity to fully comprehend in the moment. Yet you draw on it every time you issue a simple voice command to a small cylinder in your living room: ‘Alexa, what time is it?’

And so the cycle continues.

×

Matteo Pasquinelli (PhD) is Professor in Media Philosophy at the University of Arts and Design, Karlsruhe, where he coordinates the research group KIM (Künstliche Intelligenz und Medienphilosophie / Artificial Intelligence and Media Philosophy). For Verso he is preparing a monograph on the genealogy of artificial intelligence as division of labor, which is titled The Eye of the Master: Capital as Computation and Cognition.
© 2019 e-flux and the author





一个圆筒被放在一个房间里。它不引人注意,外形流畅、简单、小巧。高14.8厘米,有一个围绕着它上缘的蓝绿色圆形灯。它默默地倾听着。一个女人走进房间,怀里抱着一个熟睡的孩子,然后她对着这个圆筒说。

“Alexa,打开大厅的灯。”

圆筒突然有了生命,“好的。”房间随之亮了起来。女人微微点了点头,把孩子带上了楼。

这是一次与亚马逊Echo设备的交互,简短的命令和响应是这个为消费者语音所设计的AI设备最常见的交流方式。但在这么一个短暂的互动时刻,大量的性能矩阵被唤醒:交错的资源提取链条、人工、和跨越数据挖掘、安排、分配、预测和优化等多重网络的算法处理。这个系统的规模几乎超出了人类的想象。我们如何才能开始看到它,并理解它作为连接形式的无限性和复杂性呢?让我们从一个概述开始:一个包含了诞生、生命和死亡三个阶段的“全球系统”分解视图,一篇含有21个部分的文章。这些组成了一个AI系统的“解剖全图”。
 
亚马逊智能助手(原理图)

二





这位女士与Alexa交谈的场景取自2017年亚马逊Echo最新版本的宣传视频。视频以“向全新的Echo问好”开头并解释说Echo将连接到Alexa(人工智能代理)用以“播放音乐,呼叫朋友和家人,控制智能家居设备等等。”该设备包含七个定向麦克风,即使在播放音乐时也可以随时听到用户的声音。该设备有多种样式,例如枪灰色或基础米色,专为“融入环境或脱颖而出”而设计。但即使是闪亮的设计选项也保持着一种空白感:没有任何东西会提醒其拥有者注意到这设备背后庞大的、用以驱动其互动能力的网络。宣传视频只是简单地指出消费者可以要求Alexa做的事情在不断增加。“因为Alexa在云端,她会变得越来越智能,并不断添加新功能。”

这一切是如何发生的呢?Alexa是一个无实体的拟人声音,代表着为一系列极其复杂的信息处理网络而设计的人机交互界面。这些网络收到持续不断的,宛如潮汐般的馈送:人类的声音流如潮水般涌进网络,被翻译成文字,查询和匹配数据库中的潜在答案,Alexa的回应则是这一波潮水的尾声 。对于Alexa给出的每个响应,其有效性可以从以下问题来推断:

同样的问题再次出现了吗?(用户是否感到被听见?)

问题被重新组织了吗?(用户是否觉得这个问题被理解了?)

这个问题之后是否有行动?(本次交互是否引向了可被追踪的动作:比如灯开了,产品被购买了,音乐被播放了?)

通过每一次互动,Alexa都会进行自我培训,以便更好地聆听,更准确地解释,根据用户的命令更精准地触发动作,建立一种更完整的偏好、习惯和欲望模型。是什么使得Alexa的这些行为能够成为现实?简单地说:在每一个为用户创造方便的小时刻——无论是回答问题,开灯还是播放歌曲——都需要用到庞大的,由不可再生材料,劳动力和数据所推动的巨型全球网络。其所需资源的规模比人类操作家用电器或轻弹开关所需的能量和劳动力大得多。虽然说完全计算这些成本几乎是不可能的,但为了理解和治理遍布生活中的技术基础设施,至少掌握这些资源消耗的规模和范围正在变得越来越重要。

三





乌尤尼盐沼是世界上最大的平坦地表,它位于玻利维亚西南部,海拔3656米。这是一个被好几米厚的盐壳所覆盖的高原,这些盐壳的锂含量异常丰富,占世界锂储量的50%至70%。撒拉尔与邻近的智利阿塔卡马地区和阿根廷一起,成为了锂开采的主要地点。这种柔软的银色金属目前被用于移动互联设备的供电,是生产锂离子电池的关键材料,被称为“灰金”。例如,智能手机电池通常含有少于8克的锂。每辆特斯拉汽车的电池组大约需要7千克锂。所有这些电池的使用寿命都有限,一旦用完,它们就会被当垃圾扔掉。亚马逊提醒用户不可以自己打开修理Echo,因为这将使保修失效。 亚马逊 Echo采用插座供电,并配有移动电池。但这也只有有限的使用寿命,用完后必须作为垃圾扔掉。

根据艾马拉人中关于玻利维亚创造的传说,安第斯高原的火山山脉源自一个悲剧。很久以前,当火山仍然活跃并自由地“漫步”平原时,图努帕(Tunupa)——唯一的女性火山——生下了一个婴儿。因为嫉妒,雄性火山偷走了她的宝宝并把它放逐到了一个遥远的地方。众神通过将火山全部钉在地球表面上来惩罚火山。图努帕为她死去的孩子而悲泣,她的眼泪和母乳结合在一起,形成了一个巨大的盐湖:乌尤尼盐沼。正如利亚姆·杨(Liam Young)和凯特·戴维斯(Kate Davies)所说,“你的智能手机依靠火山的眼泪和母乳。这片土地通过我们口袋里的手机与地球上的任何地方连接起来; 通过无形的商业、科学、政治和权力线索与我们每个人联系起来。”

四







我们的分解视图结合并可视化了大规模人工智能系统运行所需的三个主要“采掘”过程:物质资源、人类劳力、和数据。我们考虑这三个因素在时间上的变化——通过对单个亚马逊Echo的诞生、生命与死亡的视觉描绘来展现。为了真正体现出采掘过程的全球规模,我们有必要越过那种对于个体、个体数据,个体技术公司三者之间关系的简单分析。文森特·莫斯科(Vincent Mosco)已经说明了“云”这种对于离线数据管理和数据处理的缥缈隐喻,与通过强征人口来开采地壳矿物的物理现实是彻底矛盾的。桑德罗·梅扎德拉(Sandro Mezzadra)和布雷特·尼尔森(Brett Nielson)用“采掘主义”这个术语来命名当代资本主义中各种采掘性操作形式之间的关系,恰恰是我们在人工智能产业的情境中不断见证的。挖空地球和生物圈的资源,与AI中的数据采集、将人类沟通和社交行为货币化具有非常深入的互连关系。 梅扎德拉和尼尔森指出,劳动是这种采掘关系的核心,这种关系在历史上一再重复:从欧洲帝国主义使用奴隶劳动,到马来亚橡胶种植园被强迫工作的工人,再到玻利维亚的土著被驱使提取在第一全球货币中使用的白银。当我们思考“采掘”的概念时,需要同时考虑到劳动力,资源和数据三个要素。这对人工智能的批判性理解和它的流行解释提出了挑战:单独“看到”任何这些过程都很难,更不用说整体性地了解。因此,需要一种可视化方式,可以将这些原本分散的流程连接整合到一个完整图景中。

五







如果您从左到右阅读我们的地图,整个故事以地球开始,也以地球结束,并伴随着跨越深度时间的地质过程。但是如果从顶部读到底部,我们会看到:故事的开头和结尾都是人。顶部是人类主体,向Echo提问,并向亚马逊提供有价值的口头问题训练数据,以及能被用来进一步完善AI语音系统的回答。地图的底部是另一种人力资源:人类知识和能力的历史,也被用于训练和优化人工智能系统。这是人工智能系统与其他形式的消费者技术之间的关键区别:它们依赖于大量人类生成的图像、文本和视频的摄取,分析和优化。

六







当人类使用Echo或其他支持语音的AI设备时,他们的身份不仅仅是产品终端的消费者。我们很难将人工智能系统的人类用户置于单一类别中:相反,他们应该被视为混合体。正如希腊神兽喀迈拉是由狮子、山羊、蛇和怪物混合起来的神话动物一样,Echo用户是消费者,是资源本身,是工人,也是产品。人类用户具有多重身份这一情况不断地出现在各种技术体系之中。具体到亚马逊Echo这一案例中,用户购买了一个针对消费者的设备,获得了一系列方便的功能。但他们也是一种资源,因为他们的语音命令被收集,分析和保留,以便建立一个更大的人类语音和指令语料库。他们提供劳动力,因为他们不断提供有价值的服务,提供有关Alexa回复准确性、实用性和整体质量的反馈机制。从本质上讲,他们在帮助训练亚马逊基础设施堆栈之内的神经网络。

七





超出设备的物理及数字界面之外的任何东西,都不在用户的控制范围之内。Echo有着光滑的表面,无法打开,无法修复,也无法改变其内在功能。圆筒本身是一个非常简单的集合了传感器的塑料产品——其真正的能力和复杂性位于远远看不见的地方。 Echo只是家中的一个 “耳朵”:一个无实体的、从未显示出与远程系统深层联系的听觉代理。

1673年,耶稣会的博学家阿塔纳修斯·基歇尔(Athanasius Kircher)发明了对讲雕像- 即“会说话的雕像” 。基歇尔是一位非凡的跨学科学者和发明家。在其一生中,他出版了多达四十件主要作品,覆盖医学、地质学、比较宗教学和音乐等不同领域。他还发明了第一个磁钟,许多早期的自动机,以及扩音器。 这个可以说话的雕像是一个非常早期的聆听系统:它本质上是一个由巨大的螺旋管制成的麦克风,一边采集来自公共广场的对话声音,一边通过管道传达到在贵族私人房间内的雕像口中。正如基歇尔所写:

“这个雕像必须位于一个既定的位置,以便使螺旋形管的末端部分精确地对应于嘴的开口。以这种方式它将完美的,并且能够清楚地发出任何声音:事实上,雕像将能够用人或动物的声音持续说话:它会大笑或者冷笑;它会哭或者呻吟;有时它会发出令人吃惊的巨响。如果螺旋管的开口位于开放的公共空间,那么所有人们发出的音节都汇集于管道中,通过雕像的嘴重放而出。”

这一听力系统可以窃听广场上的日常对话,并将其传达给17世纪的意大利寡头政府。 基歇尔的谈话雕像是社会精英提取信息的早期形式——在街上说话的人将完全不知道他们的谈话内容正在被其他人利用,用以掌握权力、制造娱乐和财富。贵族家中的人们并不知道这么一个神奇的雕像是如何说话和传达各种信息的。雕像的目的就是模糊整个系统的运作方式:人们只能看到一个优雅的雕像。即使在这一早期阶段,听力系统就已经是为了权力、阶级和秘密服务的。但基歇尔系统的基础设施非常昂贵——仅限少数人使用。这个问题至今仍然存在,构建此类系统的全部资源到底需要多少?所以,我们需要了解底层基础设施的物质性。


阿塔纳修斯·基歇尔,“对讲雕像”,1673

八





杰西·帕瑞卡(Jussi Parikka)在其《媒体地质学》一书中提出,我们不要试图从马歇尔·麦克卢汉的观点来看待媒体——媒体是人类感官的延伸 ——取而代之的我们要把媒体看成地球的延伸。媒体技术应该在地质过程的背景下理解,从创造和转化过程,到构建媒体的自然元素的运动。反复思考媒体和技术作为一种地质过程,能让我们仔细考虑当前技术所必需的不可再生资源的深度消耗。AI系统扩展网络中的每个对象,从路由器到电池到麦克风,都是使用数十亿年才能生成的元素构建的。从深度时间的角度来看,我们正在“采掘”地球的历史以服务于技术时间的一瞬,建造使用不超过几年的设备。例如,消费者技术协会指出智能手机的平均寿命为4.7年。这种过时淘汰的循环促使人们购买更多设备,驱动更多的商业利润,同时提高对不可持续的采掘操作的奖励。元素和物质材料来自于一个非常缓慢的元素发展过程,却经历了一个极其迅速的挖掘、冶炼、混合和物流运输的阶段——在这种转化之中穿越数千公里。地质过程标志着这一阶段的开始和结束,从矿石开采到电子垃圾堆中的材料沉积。出于这个原因,我们的地图以地壳开始也以其结束。然而,我们所描绘的所有转化和运动只是最基础的解剖轮廓:在这些联系之下存在更多层次分形状态的供应链、人类资源和自然资源的开发、企业和地缘政治力量的集中以及持续的能源消耗。

九





在资源、劳动力和数据挖掘之间建立的联系让我们不可避免地回到传统的剥削框架。但是这些系统是如何产生价值的?可以在克里斯蒂安·富克斯(Christian Fuchs)和其他作者检视和定义数字劳动的文章中找到一个有用的概念工具。数字劳动的概念最初与不同形式的非物质劳动形式有关,它的出现先于人工智能之类的设备和复杂系统。数字劳动——建立和维护数字系统堆栈(stack)的工作——绝不是一种虚拟或一时热度的工作,而是深入地体现在不同的活动中。其涉及的范围极广:矿山中的契约劳工开采构成信息技术物理基础的矿物;中国工厂中严格控制、但有时却很危险的硬件制造和装配过程;发展中国家利用外包工人来标记AI训练数据集;以及非正式的体力劳动者清理有毒废物堆。这些过程创造并积累了新的财富和权力,但却集中在一个非常小的社会层面。

 
Product of labour(Subject - object): 劳动产品(主体 - 客体)
Labor power(Subject): 劳动力(主体)

Means of production: 生产资料

Marx’s dialectic of subject and object in economy: 马克思对于经济的主客体辩证法

十






这个价值攫取和生产的三角代表了我们地图中的一个基本要素,从地质过程中的诞生,到以消费者AI产品的身份为生,最后在电子垃圾堆中消亡。就像富克斯的作品一样,我们的三角不是孤立的,而是在生产过程中相互连接着。它们形成了一种循环流动,其中工作的产物被转化为资源,资源被转化为产品,再被转化为资源等等。每个三角代表生产过程中的一个阶段。虽然在地图上看起来像是一个线性转换的路径,但采用这种独特的视觉隐喻更能代表当前采掘主义的复杂性:即谢尔宾斯基三角的分形结构。

线性的展示不能显示出生产和开发的每一步都是包含前面阶段的。如果我们通过分形视觉结构来看待生产和开发系统,最小的三角将代表自然资源和劳动力,即矿工作为劳动力,矿石作为产品。下一个更大的三角形包括金属加工,再下一个将代表制造零件的过程等等。我们地图中最终的三角形,即亚马逊Echo单元本身的生产和制造,包括所有层次的剥削——从亚马逊公司的底部到顶部,即杰夫·贝索斯(Jeff Bezos)以亚马逊首席执行官身份所扮演的角色。像古代埃及的法老一样,他站在最大的人工智能价值采掘金字塔的顶端。


谢尔宾斯基三角或者谢尔宾斯基分形结构



十一

回到这种可视化——马克思生产三角的一种变形——的基本要素,每个三角在创造利润时产生了剩余价值。如果我们查看一台设备生产过程中每项活动的平均收入规模(在地图左侧显示),我们会发现收入的巨大差异。根据国际特赦组织(Amnesty International)的研究,在挖掘钴元素(也被用于制作16个跨国品牌的锂电池)期间,工人每天可获得相当于1美元的工资,却要在危及生命和健康的环境下工作并常常受到暴力、勒索和恐吓。国际特赦组织还调查到有年仅7岁的儿童在矿场工作。相比之下,亚马逊首席执行官杰夫·贝索斯位于该分形金字塔的顶端,根据彭博亿万富翁指数(Bloomberg Billionaires Index),他在2018年前五个月平均每天收入2.75亿美元。在刚果的一个矿场工作的孩子需要超过70万年的不间断工作才能获得与贝索斯一天收入相等的金额。

这张地图上显示的许多三角形都隐藏着劳动剥削和不人道工作条件的故事。元素转换对生态的影响和收入差距只是表现这种深层的系统性不平等的方式之一。我们也研究了不同形式的“黑箱”,即不透明的算法过程,但这张地图还指向另一种形式的不透明:创造、训练和操作像亚马逊Echo这样的设备的过程,本身就是一种黑匣子,考虑到世界各地的多层承包商、分销商和下游物流合作伙伴,我们很难对其进行检查和跟踪。正如马克·格雷厄姆(Mark Graham)所写,“当代资本主义掩盖了消费者对大多数商品的历史和地理位置的了解。消费者通常只能在当下的空间和时间中看到商品,很少有机会通过生产链向过去凝视,以获得有关生产、转化和分销地点的知识。”

调查和跟踪当代生产链流程难度的一个例证是,英特尔用了四年多的时间来充分了解其供应线,以确保其微处理器产品中没有来自刚果的钽元素。作为半导体芯片制造商,英特尔为苹果提供处理器。为了做到这一点,英特尔拥有自己的多层供应链,在100多个国家拥有超过19,000家供应商,为其工厂、物流和包装服务直接提供生产流程、工具和机器。一家领先的科技公司花了整整四年的时间才了解自己的供应链,揭示了这个过程从内部掌握的难度,更不用说外部研究人员、记者和学者了。总部位于荷兰的科技公司飞利浦也声称其正在努力使其供应链“无冲突”。比如,飞利浦拥有数万家不同的供应商,每家供应商都为其制造流程提供不同的组件。这些供应商又与成千上万的组件制造商连接在一起,这些制造商从数百家炼油厂购买材料,这些炼油厂从不同的冶炼厂购买原料,这些冶炼厂由未知数量的贸易商提供,直接涉及合法和非法采矿业务。在《权利的元素》中,大卫·S·亚伯拉罕(David S. Abraham)描述了全球电子供应链中稀有金属交易商的无形网络:“从矿山到笔记本电脑的稀有金属网络通过交易商、处理器和元件制造商的模糊网络传播。交易员不仅仅是购买和销售稀有金属的中间商:他们也在调控信息,是金属工厂和我们笔记本电脑组件之间的隐藏环节。” 根据计算机制造公司戴尔的说法,金属供应链的复杂性带来了几乎无法克服的挑战。这些矿物的开采早在最终产品组装之前就开始了,这使得追踪矿物的来源非常困难。另外,许多矿物质与再生金属一起冶炼,几乎不可能将矿物质追溯到其来源。因此,我们看到理解完整供应链的尝试是一项实实在在庞大的任务:揭示了21世纪全球技术产品生产的复杂性。



十二

供应链总是一层叠一层,处在一个不断蔓延的网络中。苹果的供应商体系揭示了在它们的设备中:成千上万个单独制作的零件被嵌入其中,而这些零件又是由上百家不同的公司所供应。为了使每一个零件都能够出现在装配线上,并由富士康工厂的工人来组装,不同的零件需要从30个国家的750个供应商地点运输而来。这形成了一种超级复杂结构,供应链中嵌套着供应链,忽然猛增的几万个碎片化供应商,几十万吨船载原料和数十万在产品装配之前就牵涉在过程中的工人。

如果将这一过程想象为一个材料、零件和产品流动所凭借的全球性、泛大陆网络,我们会看到它与全球信息网络的相似之处。信息网络中,单个因特网包(Internet Pack)会行进至亚马逊Echo系统之中,我们在此可以将其想象成一个单独的货物集装箱。全球物流和生产如今令人炫目的壮观景象离不开这种简单又标准化的金属物发明。由于标准化集装箱的存在,使得现代船运工业可以将地球模型化为一个大型、单一的工厂,从而迎来了爆炸性的发展。2017年,海运交易的集装箱运货船总容量接近250,000,000固定负载,主要来自于丹麦马士基集团、瑞士地中海航运公司,以及法国达飞海运集团等船运巨头公司,其中每一家都拥有数百艘集装箱货船。对于这些商业投资者来说,船运是一种能够穿行于全球工厂的复杂网络中相对廉价的方式,但它掩盖了更大的外部成本。近些年来,货运船每年排放的二氧化碳占全球排放量的3.1%,比整个德国的排放量还要高。为了最小化他们的内部成本,大部分集装箱船运公司大量使用非常劣质的燃料,导致空气中含硫量上升,也带来其它有毒物。据估计,一艘集装箱货运船的污染排放量与5千万辆汽车相当,而这一工业所带来的相关问题也间接导致了全世界每年六万人的死亡。甚至像世界航运工会这样与产业比较亲近的组织也承认,每年都有数千个集装箱在海上丢失,沉入海底或者四处漂流。有些集装箱装有毒性物质,可能会渗入海洋之中。一般来说,劳工们会在海上呆9到10个月的时间,常常要长时间轮班,而且与外界没有联系。菲律宾的劳工在全球货运劳力中的占比超过三分之一。全球物流中最要紧的成本来自于大气,海洋生态系统和其中所包裹的一切,以及低薪劳工。
 

货物集装箱



十三

人类科技与日俱增的复杂化和微型化取决于一种技术过程,令人感到奇异的是,这种过程与早期中世纪炼金术的目标相呼应。中世纪炼金术试图将基础金属转化为“高贵”的种类,而今天的研究者使用稀土金属来增强其它矿物质的性能。地球一共有17种稀土元素,深嵌于笔记本电脑和智能手机之中,以使它们的体积更小,重量更轻。在色彩显示器,扬声器,相机镜头,GPS系统,可充电电池,硬件驱动以及其它很多元件之中,稀土都会起到一定的作用。从光纤电缆,移动通信塔中的信号放大到卫星和GPS技术中都有稀土元素的存在,它是通讯系统的关键组成。但是我们很难弄清楚这些矿物质精准的用处和配置。正如中世纪炼金术士的研究隐藏在密码及含义模糊的符号之中,当代社会对于这些稀土在设备中的使用过程也被保护在保密协议和交易机密之中。


稀土具有非常独特的电子,光学和磁学性能,人类至今无法找到其它材料或者合成替代品可以与之相提并论。虽然它们被称为“稀土金属”,但有些元素在地壳中的含量是相对丰富的,只是开采起来非常昂贵,并且带来高度污染。大卫·亚伯拉罕(David Abraham)描述了广泛用于高科技设备的元素镝(Dy)和铽(Tb)在中国江西的开采情况。他写到:“被开采的粘土中只有0.2%含有宝贵的稀土元素。这意味着被挖出的稀土矿土中,99.8%都被当成废料弃置,它们被称为“尾矿”(tailings),丢弃回了山川溪流之中,”又产生了比如铵这样新的污染物。为了精炼一吨稀土元素,“中国稀土协会预估这一过程将会产生75000升酸性水,以及一吨放射性残渣。”此外,开采和精炼活动消耗大量的水,排放大量二氧化碳。2009年,中国生产了全世界稀土供应量的95%,据估计,白云鄂博单一矿区就占到世界存量的70%。


钪钇镧铈
镨钕钷钐
铕钆铽镝
钬铒铥镱

稀土元素



十四

一张关于印度尼西亚邦加小岛的卫星照片讲述了半导体生产中人类与环境付出的代价。在这个小岛上,大多数“非正式”矿工都是站在临时的浮桥上,用竹竿刮过海底,然后潜入水下,用巨大的类似真空的管子从海底表面吸取锡。据《卫报》的一份调查报告所述,“开采锡矿的利润非常可观,但这是一种毁灭性贸易,会破坏小岛的景观,推倒农场和森林,消灭渔业资源和珊瑚礁,损害漂亮的棕榈海滩所带来的旅游业。从空中观察这一破坏最为清晰,可以看到在团团簇拥的茂密树林周围,有一些巨大的橙色长条形贫瘠土地。这一景观不是由采矿直接主导的,而是由麻点一样的坟墓所形成,其中埋葬着过去几个世纪因为采锡而亡的矿工尸体。

邦加和勿里洞这两个小岛,产出了印度尼西亚90%的锡,而印度尼西亚是世界上第二大的锡出口国。印尼的国有锡业公司天马集团,直接向三星等公司供货,也向焊材制造商晟楠和升茂供货,这两家公司转而将产品供应给索尼、LG和富士康。



十五

在亚马逊配送中心,大量产品按照计算机安排的顺序被放置在几百万个货架上。这个空间中的每个物品位置都由复杂的数学函数精确计算得出,这些函数处理订单信息,并创建产品之间的相对关系。这一算法行为的目标是为了优化仓库中共同协作的人机运动。工人凭借电子手环的引导和帮助,穿行于飞机库大小的仓库中,这些仓库塞满了依据不透明的算法规则而排列的物体。

在亚马逊成千上万的公开专利之中,有一份编号为9280157的美国专利令人吃惊地表现出了对于工人的异化,这是人机关系中的至暗一刻。这一专利描绘了为工人设计的金属牢笼,上面装有不同的机械控制附件,由一个原本用来挪动商品货架的驱动系统来控制它的移动。

我们从剖析图的研究中一次又一次看到,反乌托邦的未来建立在当下和历史反乌托邦体制所达成的一种不均匀分布之中,并散布在一系列现代技术设备的生产链条之中。隐匿在价值掘采金字塔顶端的少数人拥有令人惊讶的财富和舒适的生活。但是这一金字塔的主体却来自矿井的黑暗隧道,放射性的废弃物湖泊,弃置的运输箱,以及大公司的工厂宿舍。


编号为20150066283 A1的亚马逊持有专利



十六

在19世纪末,一种名为电木的东南亚特色树木成为了技术爆发的核心。这些树主要来自马来西亚,能够产出一种名为杜仲橡胶的奶白色天然乳胶。1848年,英国科学家迈克尔·法拉第(Michael Faraday)在《哲学杂志》上发表了使用这种材料作为电绝缘体的研究,之后杜仲橡胶迅速风靡了工业世界。它被认为能够解决绝缘电报线经受海底环境的问题。随着全球海底电缆业务的增长,电木的需求也不断增加。历史学家约翰·塔利(John Tully)描述了本地马来人,华人以及迪雅克人如何为了微薄的薪水而从事十分危险的树木砍伐和收集乳胶的工作。处理过的乳胶经新加坡的贸易市场销往英国,随后被加工成很多产品,包括绵长无尽的海底电缆护套。

一株成熟的电木可以产出约300克乳胶。但是,1857年制成的第一条横跨大西洋的电缆大约3000公里长,重达2000吨 – 需要大约250吨的杜仲橡胶。为了生产1吨这种材料,就需要90万根树干。马来西亚和新加坡的丛林被砍光了,等到1880年代早期,电木已经灭绝了。英国为了在最后关头抢救一下自己的供应链,于1883年通过了一项停止采集乳胶的禁令,但这种树木已经绝种了。

维多利亚时代杜仲橡胶的环境灾难,从全球信息社会的历史源头上说明了技术与其物质性、周围环境和不同形式的采掘之间形成了层叠交错的关系。正如维多利亚时期的人为了早期的电缆而使得生态灾难突然降临,现在的稀土挖掘和全球供应链体系也会危及当代脆弱的生态平衡。从建设当代网络社会所需的技术材料到大规模基础设施中传输、分析和储存数据所需的能源再到基础设施的物质材料:这些深入的联系和付出的代价比今天的大公司对于人工智能的想象要更为重要,历史也更为久远。


电木



十七

大规模人工智能极大地消耗能源。但这些消耗的物质细节在大众想象中仍然是模糊的。想要精确得到关于云端计算服务的能源消耗量是非常困难的。据一份绿色和平组织报告所述,“在产业透明化的推进过程中,最大的障碍之一是亚马逊网络服务(AWS)。世界上最大的云计算公司进行大规模操作的能源足迹几乎是完全不透明的。在全球云供应商中,只有亚马逊网络服务依然拒绝告知公众其公司业务对于能源消耗和环境的基本影响。”

我们作为人类主体,与技术平台每次的交互中几乎都是可见的。我们总是可以被追踪,被量化,被分析,以及被商品化。但与用户的可见性相反,这些联网设备的生命周期细节,包括出生,活着和死亡各个阶段的情况,都是模糊的。随着Echo这种依赖于中心化人工智能基础设施的设备出现,这些细节就更加的不为人知。即使消费者逐渐熟悉了客厅里的一台小型硬件设备,或者是一个手机应用,或者是一台半自动驾驶的汽车,但是真正的工作是在机器学习系统中完成的,一般来说,用户与该系统距离很远,而且全然无法感知。在很多情况下,透明度并没有多大意义 – 如果用户缺乏真实的选择空间,以及如果企业不能负起责任来,仅仅是透明度无法扭转目前这种权力的不对等。

机器学习系统的输出结果在多数情况下是不负责任且不受治理的,但它的输入数据常常成谜。对于一些漫不经心的旁观者来说,从没有像今天这样可以如此轻易地建造一个人工智能或者以机器学习为基础的系统。触手可得的开源工具,结合亚马逊(AWS)、微软(Azure),或者谷歌(Google Cloud)等云处理巨头所提供的可租借算力,正在推动一种错误的人工智能民主化想象。即使机器学习的工具不再开架售卖,而是变得更为易得,比如TensorFlow,它们鼓励你建立自己的系统,但这些系统的底层逻辑,以及训练数据集只有少数几家公司可以获取,并被他们掌控在手里。像Facebook之类的平台所进行的是动态数据采集,用户的行为,嗓音,标记的图片和视频或者医疗数据都会被用来训练神经网络。在这个采掘主义盛行的时代,数据真正的价值被少数金字塔顶端的人所控制并掘取。



十八

当大规模数据集被用来训练人工智能系统,其中涉及的个人图像和视频通常会被打上标注。这一标注过程如何将数据原有的意义废置并冻结,以及更进一步,如何利用低薪数据标注劳工来完成,需要被全面阐释。

1770年,匈牙利发明家沃尔夫冈·冯·肯佩伦(Wolfgang von Kempelen)建造了一种下棋机器,被称作土耳其机械人(Mechanical Turk)。他的目的,部分是为了获得奥地利女皇玛利亚·特蕾莎(Empress Maria Theresa)的关注。这个机器能够与人类对手下棋,而且在持续了将近九十年的欧洲和美洲巡展期间赢得了大部分的比赛,十分令人惊讶。但是土耳其机械人只是一种障眼法,一位棋手大师藏在机器里面并操作机器下棋。160年后,亚马逊网开始推广基于微支付的众包平台,同样以土耳其机械人为名。据艾汉·埃蒂斯(Ayhan Aytes)所言,亚马逊曾试图通过人工智能程序来寻找零售网站上的雷同产品页,但这一努力失败了,随后亚马逊便着手开发土耳其机械人。在一系列徒劳而昂贵的尝试之后,项目工程师们转而使用人类劳工,这些劳工躲在计算机的后面工作,身处流水线式的基于Web的系统之中。亚马逊的土耳其机械人数字车间(Mechanical Turk digital workshop)效仿人工智能系统,对机器学习过程进行检视,评估以及纠正,但它通过人脑来完成这一过程。对于亚马逊土耳其机械人的用户来说,应用程序完成任务所使用的,似乎是高级人工智能系统。但它更接近于一种“人工的人工智能”,通过远程且分散的廉价数据标注劳工来帮助客户完成他们的生意目标。正如埃蒂斯所观察,“在这两个案例中(1770年的土耳其机械人以及亚马逊服务的当代版本),工人激活了这些骗人的诡计,但他们的工作被机器的奇观所掩盖。”

这种隐形的、被掩盖的劳动,无论是外包还是众包,藏匿于界面之后,伪装在算法过程之中,现在变得极其常见,尤其是在为训练神经网络而进行的数据标注过程,这种标注往往需要检视数千小时的数据档案。有时这种劳动完全是无偿的,比如谷歌的验证码服务系统(reCAPTCHA)。很多人都经历过一种自相矛盾的体验,为了证明你不是人工主体,而被迫去无偿训练谷歌的图像识别人工智能系统,选择那些含有街道号码,或者汽车、房子的多个方块图片。

正如我们在系统中反复见到的一样,当代的人工智能形式根本就没有脱离人。比如说矿井工人艰难的体力劳动,比如说装配线上不断重复的工人劳动,比如说分销中心的自动控制化劳动和雇佣全世界外包程序员的认知经济血汗工厂,再比如说土耳其机械人中的低薪众包劳力,或者无偿进行非物质劳动的用户。无论在上述哪个层级,当代技术都深深植根于对于人类身体的剥削,并且以人类身体为基础来运作。


土耳其机械人



十九

在他的一段式短故事“关于科学的精确性”中,豪尔赫·路易斯·博尔赫斯(Jorge Luis Borges )为我们呈现了一个虚构的帝国,该国制图科学十分发达,测量极其精确,以至于绘制一张帝国地图的尺寸与帝国实际大小相当。

“…在那样一个帝国,制图的艺术如此完美,以至于单单一个省份的地图就会覆盖一座城市,而帝国的地图,则会占据整个省。随着时间流逝,人们对于那些不合理的地图不再满意,于是制图工会制作了一张帝国面积大小的地图,与真实世界每个点的情况保持一致。他们的后代对于制图学的研究不像祖先那么热衷,认为这些巨大的地图是无用的,而且不无冷漠地任凭地图被风吹日晒。在西边的沙漠里,直到今天,仍然有破烂的地图遗迹,动物和乞丐在上面栖居;而在所有的土地上,地理学不复存在,没有任何的遗留。”

现在的机器学习方法有一种对于描绘世界的渴望,一种对于真实视觉、声音和识别机制可以全面量化的渴望。从宇宙学模型到使用最细微的人脸肌肉运动转译人类的情感世界,所有的一切都变成了量化的对象。让·弗朗索瓦·利奥塔(Jean-François Lyotard)引入了术语“趋近永恒”(affinity to infinity)来描述当代艺术、技术科学和资本主义都同样渴望将自己的边界推向一种潜在的永恒视界。19世纪下半叶,随着社会重心聚焦于基础设施的建设和不均匀的工业化转向,为一小撮垄断自然资源开采和生产过程的工业巨头创造了巨量的财富。

如今新的永恒视界便是数据挖掘,机器学习以及信息重组,它们均由人机结合而成的人工智能系统来完成。这些领域被几个全球大公司所主宰,它们设定了资本积累、人类资源及星球能源开采的机制并建造了相应的基础设施。

这种对于新资源开采和新认知能力挖掘的渴望驱动人类去探寻前所未有的深层数据,用来量化人类的心智,包括有意识的和无意识的,私人的和公共的,另类的和大众的。由此出现了多重认知经济:注意力经济,监控经济,声誉经济,以及情感经济,还出现了以加密货币为代表的信任和证明的量化及商品化。

量化的过程越来越深入人类情感,认知以及物理世界。训练集存在的目的是为了侦测情绪,寻找家族相似性(family resemblance),追踪不断变老的个体,以及识别坐下,挥手,举杯,或者哭泣等人类动作。每一形式的生物数据–包括法证、生物计量,社会计量,以及心理测量–都会被采集,并录入数据库以便训练人工智能。这一量化过程的基础通常非常局限:在类似于原子视觉动作(AVA)这样的数据集中,女性主要位于“与孩子玩耍”的动作类别,而男人主要位于“踢人”的类别。人工智能系统的训练集声称可以深入到日常生活中极其细微的特征,但实际上却一直重复最陈腐和局限的社会模式,重新印刻一种人类过去的范式,将其投射到未来。


量化自然



二十

“对于生物多样性和知识的“圈地”是自殖民主义兴起以来所进行的一系列“圈地”活动中的最后一步。陆地和森林是最早被“圈地”的资源,它们从社会公共资源被转化为商品。之后,人类通过兴建大坝,地下水开采以及私有化对水资源进行了“圈地”。现在轮到了生物多样性和知识,它们被“圈地”的方式是通过知识产权(IPRs),”范达娜·希瓦(Vandana Shiva)解释到。在希瓦的话中,“工业发展势必破坏资源的公共性,从而能够获得工业原材料所需的自然资源。而维持生命的系统可以被分享,但却不能作为私人财产被占有或者以个人利益为目的被开采。因此,社会公共资源不得不被私有化,人们赖以为生的基础也不得不被占用,以满足工业前进的动力和资本积累”

虽然希瓦的言论只涉及了知识产权对自然的“圈地”,但同样的过程也正发生在机器学习之中–一种对于量化自然的加剧。围绕人工智能的新淘金热正在圈定人类知识,感受和行为的不同领域,以便于掌控它们,将其私有化。2015年11月,当深思技术有限公司(DeepMind Technologies Ltd.)获取了皇家自由医院160万可识别病人的健康资料时,我们见证了一种特殊形式的私有化:对于知识价值的采掘。数据集可能仍然归公众所有,但数据的元价值–由它所创造的模型–是被私人所拥有的。虽然寻求改善公众健康有许多正当的理由,但如果其代价是对于公共医疗服务偷偷摸摸的私有化,那么对于社会来说是具有极大风险的。由此可以想象这样一个未来:公共系统中的本地专业人工劳力被私有大公司的中心化人工智能系统所增强,有时甚至被其取代,而这些系统使用公众数据来为少数人创造巨额财富。

 
公司边界



二十一

在21世纪的今天,我们看到了一种全新形式的采掘主义已经来临:它抵达了生物圈的最深处,以及人类认知和情感的最底层。通过机器学习所定义的对于人类生活的假设非常狭隘、范式化且遍布错误。但是它们却正在将这些假设印刻并建立在新的世界之中,而且会对人类的机会、财富以及知识的分配产生越来越大的影响。

与亚马逊Echo交互所需要的堆栈(Stack)远远不止由数据建模、硬件、服务器以及网络所组成的多重“技术堆栈”层级。全堆栈的范围可拓展到资本,劳动以及自然,而且要求每一种资源大量的参与。这些系统的真正成本–社会,环境,经济,以及政治–总是隐性的,而且可能会一直不被人所察觉。

我们提供这份地图以及这篇文章是为了开启一种对于系统性采掘的广泛视野。建立人工智能系统所需要的资源规模太过复杂,易被知识产权法所掩盖,且深陷于物流复杂性的泥潭之中,致使人们在当下无法完全理解。但实际上你每次对着客厅里的小圆柱体发出一个简单的语音指令“Alexa,现在几点了?”,你都在利用这一系统。而这个过程循环往复,不停地发生。







马蒂欧·帕斯克奈利是德国卡尔斯鲁厄艺术设计学院媒体哲学方向的教授,他在该校也带领KIM研究项目(人工智能与媒体哲学)。他目前正在进行关于作为劳动分支的人工智能系谱学的新书写作,书名是《大师之眼:作为计算和认知的资本》。
本文经由作者授权转载,原文地址:
https://www.e-flux.com/journal/101/273221/three-thousand-years-of-algorithmic-rituals-the-emergence-of-ai-from-the-computation-of-space/

© 2019 e-flux and the author