
How to have a conversation about AI

I tend to have a lot of “hot-topic” dinner conversations with people about AI: will robots take our jobs, will software intelligence take over the world, and what are the near-term impacts of big data on everything from science to ecology to law? And it’s not just me: consider all the recent symposia about “the end of work”, the “AI race”, “how to stay human in a robot society”, etc.

While not necessarily shallow, these conversations are invariably speculative. I can’t really respond to people’s concerns about AI because the questions they ask don’t connect with the technical concepts that I study. Talking about AI, for most non-technical people, is a proxy for talking about the place of people in society, present and future. They characterize AI by a set of external variables: how it will displace jobs, how it will make things cheaper and faster in their life/work, how it could make society more or less fair. These are observations one can make without knowing anything about AI, which is why I call them “external”. On the other hand, I study internal variables (a.k.a. technological variables) like error bounds on particular learning algorithms, logical programming, and a slew of engineering problems like motion planning, natural language, and domain generalization. Studying these things does not make me obviously qualified to address social-sciency concerns about the place of people. For the same reason, I’m quite skeptical of AI “experts” when they prognosticate about the impact of AI on society.

Still, there must be something that the internal variables in AI can say about the external variables—but how to say it? Relating the two sets of variables would clarify non-technical people’s concerns to technical people, measure technical developments in terms of non-technical outcomes, and suggest “internal” solutions to their “external” concerns (or prove the absence of such solutions). And maybe, just maybe, it would help all of us have better conversations about AI.

Let’s talk about jobs

Last month, I had tea with Emma and Antone Martinho-Truswell in their cozy apartment above the Waynflete. The conversation was a bit unusual: Emma and Antone had brought a small group of people together, convinced us to sit and read a paper for an hour in total silence (compensated with snacks, tea, and port), then invited us to talk about the paper for an hour. The paper we read was “Why Are There Still So Many Jobs? The History and Future of Workplace Automation” by David H. Autor. (We also read a short article from Foreign Affairs: “Will Humans Go the Way of the Horse?”.)

In his paper, Autor raised two distinctions: between substitution and complementation (two ways that human labor can be replaced by new technology), and between environmental control and machine learning (two mechanisms by which AI in particular will replace more jobs). Substitution and complementation are external variables, germane to economics. Environmental control and machine learning are two technical research areas internal to AI. As Autor makes clear, substitution/complementation and control/learning are broadly correlated, in that control and learning are particular AI technologies that will substitute for and complement human labor.

My question: can a better understanding of environmental control and machine learning help us understand and predict substitution and complementation?

Substitution occurs when a machine completely or substantially replaces a job performed by human labor, such that the human is no longer needed to do it. For example, an automated car substitutes for an Uber driver. Complementation occurs when a machine replaces a portion of the tasks performed by a human, allowing that human not only to keep his/her job but to focus on, and excel at, its other components. For example, a digital secretary complements a surgeon, who might otherwise have had to do a lot of desk-work herself. Here’s Autor:

Specifically, I see two distinct paths that engineering and computer science can seek to traverse to automate tasks for which we “do not know the rules”: environmental control and machine learning. The first path circumvents Polanyi’s paradox by regularizing the environment, so that comparatively inflexible machines can function semi-autonomously. The second approach inverts Polanyi’s paradox: rather than teach machines rules that we do not understand, engineers develop machines that attempt to infer tacit rules from context, abundant data, and applied statistics.

Here’s a more formal model of substitution and complementation. Imagine that a company purchases a machine, and that we have two diagrams of all the processes that happen inside the company: one from before purchasing the machine and one from after.

[DIAGRAMS]

A process is any box or set of boxes as depicted above. So the company itself is modeled as a process, as is any sub-diagram of the company process diagram. We denote the set of all processes (whether or not they exist in the company) by Processes. A replacement is a change in the process diagram of a company in which we take any process (whether human or machine) and replace it with another process. We can think of a replacement as a very simple kind of “rewrite” or map r : Processes -> Processes.

Firing a person and putting a robot in his place can be understood as a kind of replacement, but so is throwing out the old telephone and replacing it with a VoIP system.

[DIAGRAMS]

A job is a sub-diagram of the company process diagram (so all jobs are processes, but not all processes are jobs). We denote the space of all jobs in the company by Jobs. Remember that we ultimately care about whether jobs are human-jobs or AI-jobs, i.e. the data of (job, who-does-it-now). This data can also be thought of as a function j : Jobs -> {Human, AI}.

[DIAGRAM]

A substitution is a replacement where a “human” job is completely replaced by some non-human process or set of processes.

[DIAGRAM]

A complementation is a replacement where a “human” job is partly replaced by some non-human process or set of processes.

[DIAGRAM]

A series of replacements is like a recipe for automating your company.

We define S, the space of all substitutions, and C, the space of all complementations, by

S = { r | ?? } and C = { r | ?? }.
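To make the definitions concrete, here is a minimal Python sketch of the process model (all names below are my own invention, not a standard library): a job is a set of tasks labeled by who performs them, a replacement swaps one set of tasks for another, and the prose definitions of substitution and complementation become predicates on replacements. How to fill in the “??” precisely is exactly the open question, so treat these predicates as one candidate answer rather than the answer.

```python
from dataclasses import dataclass

HUMAN, AI = "Human", "AI"

@dataclass(frozen=True)
class Task:
    """One box in the process diagram, labeled by its performer."""
    name: str
    performer: str  # HUMAN or AI

def replace(diagram, old, new):
    """A replacement r: remove one set of tasks and splice in another."""
    return (diagram - old) | new

def is_substitution(old, new):
    """A human job is completely replaced: no human task survives."""
    return (all(t.performer == HUMAN for t in old)
            and all(t.performer == AI for t in new))

def is_complementation(old, new):
    """A human job is partly replaced: some tasks go to machines,
    while the human keeps the rest."""
    return (all(t.performer == HUMAN for t in old)
            and any(t.performer == HUMAN for t in new)
            and any(t.performer == AI for t in new))
```

For instance, swapping an Uber driver’s driving task for an automated-car task satisfies `is_substitution`, while handing only a surgeon’s paperwork to a digital secretary (and keeping the surgery human) satisfies `is_complementation`.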

Modeling firms in terms of discrete processes forces us to think through what we really mean by substitution and complementation, but it also allows us to think more clearly about the role of technologies like environmental control and machine learning. In particular, we can identify each technology with a space of feasible replacements or, more generally, with a space of feasible processes.

[EXAMPLE, e.g. train automation?]

Some questions:

  1. Sometimes complementation carries secondary benefits, e.g. the person’s limited resources may then be used to carry out her other tasks more effectively. Is this something the process model can handle?
  2. How practicable is it, really, to specify an entire space of feasible processes for a given technology? How do we even know what goes into that space?
  3. Even if we can’t, can we at least define the composition between (known) processes across spaces?
  4. Related to the above: how do we define “known” and “unknown” in these circumstances, especially for technologies that are themselves still under research?
  5. This blog post talks about things like robotics quotient: “a measure of the skills set that human workers have when working with augmenting technologies.” Measures like this analyze the relative competitiveness of human-machine teams; can we reproduce such measures in the above framework?
  6. More TBD.

Let’s talk about data

“Big data” is not a technical term; the “big”, especially, is a flatulent addition that makes me think of Walmart ads (“big savings!”) and shoddy real estate (“big bathrooms!”). Is 10,000 records “big”? Not for images, but maybe for clinical data. What about 10 million? Not for many time-series, but maybe for a census.

In reality, there are a lot of different properties of a data set that are more important and informative than “big”. Some of these properties matter if you want your machine learning algorithm to work; others arise directly from database management. For example:

  • Dimensionality: how many features does your data have: 10? 100? 10^20?
  • Sparsity: even if you have a lot of features, are most entries in your data set 0 or null?
  • Time series: is there a time column?
  • Ordinal vs. cardinal: is there a natural (partial) order or ranking to the data, or is the data just categories?
  • Heterogeneity: are the features coming from many different sources / types of information?
  • Data integration: is the data already integrated / easily query-able inside a database, or is it sitting in various .xls, .csv, .pdf, and .doc files spread across an organization?
  • Data quality: are there lots of mistakes in the data (e.g. was collection done by hand-entry?) or noise from the collection process?
  • Data provenance: where did the data come from, and how has it been transformed into the state it is now?
  • Learnability? Courtesy Sanjeev Arora: there’s some seemingly magical property of non-random data sets—perhaps the prevalence of structure—that makes them conducive to being “learned”, especially by deep nets. We don’t know what this property is yet.
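As a sketch of what profiling a data set by these properties (rather than by “bigness”) might look like, here is a small helper of my own construction, not an existing tool, that computes a few of them for a tabular data set represented as a list of records:

```python
def profile(rows, columns):
    """Summarize a data set by a few of the properties above.

    rows: list of dicts mapping column name -> value (None = missing).
    columns: list of column names (the feature set).
    """
    n = len(rows)
    dim = len(columns)  # dimensionality: number of features
    cells = n * dim
    missing = sum(1 for r in rows for c in columns if r.get(c) is None)
    sparsity = missing / cells if cells else 0.0  # fraction of null entries
    # Crude proxy for "is there a time column?"
    has_time = any("time" in c.lower() or "date" in c.lower() for c in columns)
    return {"rows": n, "dimensionality": dim,
            "sparsity": sparsity, "time_series": has_time}
```

For example, a two-row data set with columns `["date", "x"]` and one missing date would come back with dimensionality 2, sparsity 0.25, and `time_series` flagged true; none of that is captured by asking whether the data is “big”.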

From my point of view, “properties of data” don’t quite correspond to technological variables like “machine learning” or “environmental control”. Data is not a technology. It’s a resource: something that gets consumed (or produced) by a process. Of course, digital data is also characterized by the fact that it can be copied, so that it doesn’t literally get consumed.

Not all processes consume or produce (computer-readable) data: a wood chipper, for example, consumes logs and produces wood chips but touches no data. For reasons I won’t go into here, I’ll call any process that does consume and/or produce data a cyber-physical process.
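Continuing the earlier sketch (again, names here are my own invention), this distinction is easy to express: model a process by its input and output resources, each tagged by whether it is computer-readable data, and call the process cyber-physical when any such tag is set.

```python
def is_cyber_physical(inputs, outputs):
    """A process is cyber-physical if any input or output resource is data.

    inputs/outputs: lists of (resource_name, is_data) pairs.
    """
    return any(is_data for _, is_data in inputs + outputs)

# A wood chipper touches no data; a smart thermostat consumes readings
# and produces a control signal.
wood_chipper = ([("logs", False)], [("wood chips", False)])
thermostat = ([("heat", False), ("temperature readings", True)],
              [("control signal", True)])
```

Under this toy definition the wood chipper is not cyber-physical and the thermostat is.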

My question: how does this notion of data, as the input or output to certain processes, interact with the idea that we can identify every technology with a space of feasible processes? And what do the properties of data listed above tell us about the technologies?

I discuss some aspects of this question in my work on indicator frameworks for smart cities. Some additional questions to answer:

  1. What else can we say about data; are there other ways we can characterize it “axiomatically”?
  2. What’s the point; what can this sort of analysis help us do?
  3. More TBD…

Let’s talk about the end of the world

In a recent paper with Jeff Ding, we discuss market systems and how they evolve with technology, especially technologies coming out of existing AI research. (We’re particularly interested in the market for AI goods and services, but that deserves its own story.) Jeff is interested in the implications of technology for policy and for the AI arms race; I’m there because I’m interested in the economic and modeling issues.

In particular, I’m interested in how we model markets.

In that paper, we talk about markets as (open dynamical) systems. Many open systems can also be conceived of as processes, and a market can be thought of as a process that takes in information from an incredible range of sources (sometimes in the form of digital data, but not always) and outputs a list of transactions, usually buy/sell transactions with a particular piece of metadata: the price. To be clear, we’re focusing on the information-processing aspect of a market, as opposed to its role as a physical means of exchanging goods/services.
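As a toy illustration of the market-as-information-process view (my own construction; real market mechanisms are far richer), here is a process that consumes information in the form of bids and asks and outputs a list of transactions, each carrying a price as metadata:

```python
def market_process(bids, asks):
    """The market as a process: information in, priced transactions out.

    bids/asks: lists of (agent, limit_price). Matches the highest bid
    with the lowest ask while a trade is possible.
    """
    bids = sorted(bids, key=lambda b: -b[1])   # highest bid first
    asks = sorted(asks, key=lambda a: a[1])    # lowest ask first
    trades = []
    while bids and asks and bids[0][1] >= asks[0][1]:
        buyer, bid = bids.pop(0)
        seller, ask = asks.pop(0)
        price = (bid + ask) / 2  # one simple pricing rule among many
        trades.append({"buyer": buyer, "seller": seller, "price": price})
    return trades
```

The point is not the matching rule (midpoint pricing is just one choice) but the shape of the process: heterogeneous information flows in, and a list of transactions annotated with prices flows out.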

My question: how can we apply the ideas above about technology and data to the market system? What other market variables can we model using this formalism?

In order to resolve this question, I think we will need to use the following observation: technology not only defines a space of feasible processes in the market; it is also a property of goods that are exchanged on the market. Or rather: since we are concerned only with the information-processing aspect of the market, technology not only defines a space of feasible processes in the market; it is also represented within the data that is input into the market.

Let’s do some research

For a research program to make sense, it needs a focused series of questions. In this post, I’ve tried to restructure the narrative I started out with (that it’s hard to get technical and non-technical people to communicate with each other) into a series of questions, each associated with some specific application or example, in the hope of outlining some sort of research program… but I have a lot more to do! I would love to hear your comments or suggestions.

