We just officially ended the inaugural Computational Category Theory workshop at the National Institute for Standards and Technology (NIST). During the workshop the participants had five discussions, on
- algorithms for category theory,
- data structures for category theory,
- applied category theory (ACT),
- building the ACT community,
- and open problems in the field.
Below, I’ve written up a partial summary of these discussions.
Back in January, Yiannis (Vlassopoulos) and I were talking about “quadratic relations” and higher concepts in language, for example the analogy between “(king, queen)” and “(man, woman)”. Deep neural networks can learn such relations from a set of natural language texts, called a corpus. But there are other ways of learning such relations:
|Representation of corpus
||How to learn / compute
||standard algorithms, e.g. neural nets
||computational category theory
|formal concept lattice
There are some very nice connections between all five representations. In a way, they’re all struggling to get away from the raw, syntactic, “1-dimensional” data of word co-location to something higher-order, something semantic. (For example, “royal + woman = queen” is semantic; “royal + woman = royal woman” is not.) I’d like to tell a bit of that story here.
After seeing David Spivak’s talk on operads for design at FMCS 2015, I immediately thought of Brooks’ subsumption architecture. The subsumption architecture was one of the first formalisms for programming mobile robots—simple, insect-like robots capable of feeling their way around without needing to plan or learn. Operads, on the other hand, are certain objects in category theory used to model “modularity”, e.g. situations where multiple things of a sort can be combined to form a single thing of the same sort.
I’d like to formalize subsumption using operads.
But why would anyone want to formalize an derelict robotics architecture with high-falutin’ mathematics?
The answer is simple. It’s not that subsumption on its own is important (though it is) or that it requires formalizing (though it does). What I’d really like to understand is how operads give domain-specific languages (and probably much more) and whether categories are the right way to pose problems that involve combining and stacking many such DSLs—think of a robot that can move, plan, and learn all at the same time—which, for lack of a better term, I will call hard integration problems.
(The rest of this post is currently in process! I will come back throughout the fall and update it.)
Back when I was a rube just starting to learn algebraic topology, I started thinking about a unifying platform for mathematics research, a sort of dynamical Wikipedia for math that would converge, given good data, to some global “truth”. (My model: unification programs in theoretical and experimental physics.) The reason was simple—what I really wanted was a unifying platform for AI research, but AI was way, way too hard. I didn’t have a formal language, I didn’t have a type system or a consistent ontology between experiments, I didn’t have good correspondences or representation theorems between different branches of AI, and I certainly didn’t have category theory. Instinctively, I felt it would be easier to start with math. In my gut, I felt that any kind of “unifying” platform had to start with math.
Recently I met some people who have also been thinking about different variants of a “Wikipedia for math” and, more generally, about tools for mathematicians like visualizations, databases, and proof assistants. People are coming together; a context is emerging; it feels like the time is ripe for something good! So I thought I’d dust off my old notes and see if I can build some momentum around these ideas.
- In part 1, examples, examples, examples. I will discuss type systems for wikis, Florian Rabe’s “module system for mathematical theories“, Carette and O’Connor’s work on theory presentation combinators, and the pro/con of a scalable “library” of mathematics.
- In part 2, I’d like to understand what kind of theoretical foundation would be needed for an attack on mathematical pragmatics (a.k.a. “real mathematics“) and check whether homotopy type theory could be a good candidate.
- In part 3, I will talk about mathematical experiments (everything we love about examples, done fancier!), their relationship with “data”, and what they can do for the working mathematician.
Recently, I’ve been working on a project to apply persistent homology to neural spike train data, with the particular goal of seeing whether this technique can reveal “low frequency” relationships and co-firing patterns that standard dimensionality reduction methods like PCA have been throwing away. For example, some neurons fire at a very Hz, around ~75-100 Hz, while in fact most neurons fire at ~10-30 Hz in response to some stimulus. The loud, shouty neurons are drowning out the quiet, contemplative ones! What do the quiet neurons know? More to the point, how do I get it out of them? Continue reading
Hello! I’m a student / mathematician / computer scientist doing research on the intersections between geometry and artificial intelligence. I am currently doing a PhD at Oxford; previously, I was the entrepreneurial lead for Categorical Informatics, a math+data company spun out of MIT.
To contact me, send me an email at joshua dot z dot tan at gmail dot com (remember the “z” in the middle, otherwise you’ll get someone different!).
Say that we are given a single room in Borges’ library, and that we would like to say something about what those books are about—perhaps we can cluster them by subject, determine a common theme, or state that the selection is rigorously random. One way to start would be to scan every book in the room, represent each one as a long string (an average book has about 500,000 characters, though each book in Borges’ library has 410 pages x 40 lines x 80 characters per line = 1,312,000 characters), then perform some sort of data analysis.
In this case the number of books in each room (32 per bookshelf x 5 shelves per wall x 4 walls = 640 books) is far less than the length in characters of each book, i.e. the dimension of the book if we represent it in vector form. We would like to pare down this representation so that we can analyze and compare just the most relevant features of these books, i.e. those that suggest their subjects, their settings, the language they are written in, and so on.
Typically, we simplify our representations by “throwing out the unimportant dimensions”, and the most typical way to do this to keep just the two to three dimensions that have the largest singular value. On a well-structured data set, singular value decomposition, aka PCA, might leave only 2-3 dimensions which together account for over 90% of the variance… but clearly this approach will never work on books so long as we represent them as huge strings of characters. No single character can say anything about the subject or classification of an entire book.
Another way of simplifying data is to look at the data in a qualitative way: by studying its shape or, more precisely, its connectedness. This is the idea behind persistent homology.
To be continued…