Twin-width II: small classes


Abstract
The recently introduced twin-width of a graph $G$ is the minimum integer $d$ such that $G$ has a $d$-contraction sequence, that is, a sequence of $|V(G)| - 1$ iterated vertex identifications for which the overall maximum number of red edges incident to a single vertex is at most $d$, where a red edge appears between two sets of identified vertices if they are not homogeneous in $G$ (neither fully adjacent nor fully non-adjacent). We show that if a graph admits a $d$-contraction sequence, then it also has a linear-arity tree of $f(d)$-contractions, for some function $f$. Informally, if we accept to worsen the twin-width bound, we can choose the next contraction from a set of $\Theta(|V(G)|)$ pairwise disjoint pairs of vertices. This has two main consequences. First, it allows us to show that every bounded twin-width class is small, i.e., has at most $n!\,c^n$ graphs labeled by $[n]$, for some constant $c$. This unifies and extends the same result for bounded treewidth graphs [Beineke and Pippert, JCT '69], proper subclasses of permutation graphs [Marcus and Tardos, JCTA '04], and proper minor-free classes [Norine et al., JCTB '06]. It implies in turn that bounded-degree graphs, interval graphs, and unit disk graphs have unbounded twin-width. The second consequence is an $O(\log n)$-adjacency labeling scheme for bounded twin-width graphs, confirming several cases of the implicit graph conjecture.
We then explore the small conjecture that, conversely, every small hereditary class has bounded twin-width. The conjecture passes many tests. Inspired by sorting networks of logarithmic depth, we show that $\log_{\Theta(\log\log d)} n$-subdivisions of $K_n$ (a small class when $d$ is constant) have twin-width at most $d$. We obtain a rather sharp converse with a surprisingly direct proof: the $\log_{d+1} n$-subdivision of $K_n$ has twin-width at least $d$. Secondly, graphs with bounded stack or queue number (also small classes) have bounded twin-width. These sparse classes are surprisingly rich since they contain certain (small) classes of expanders. Thirdly, we show that cubic expanders obtained by iterated random 2-lifts from $K_4$ [Bilu and Linial, Combinatorica '06] also have bounded twin-width. These graphs are related to so-called separable permutations and also form a small class. We suggest a promising connection between the small conjecture and group theory.
Finally we define a robust notion of sparse twin-width. We show that for a hereditary class $\mathcal{C}$ of bounded twin-width the following five conditions are equivalent: (1) every graph in $\mathcal{C}$ is $K_{t,t}$-free for some fixed $t$, (2) every graph in $\mathcal{C}$ has an adjacency matrix without a $d$-by-$d$ division with a 1 entry in each of the $d^2$ cells for some fixed $d$, (3) every graph in $\mathcal{C}$ has at most linearly many edges, (4) the subgraph closure of $\mathcal{C}$ has bounded twin-width, and (5) $\mathcal{C}$ has bounded expansion. We discuss how sparse classes with similar behavior with respect to clique subdivisions compare to bounded sparse twin-width.

ACM Subject Classification: Mathematics of computing → Discrete mathematics → Graph theory

Introduction
We continue to develop the theory of twin-width, a novel graph and matrix invariant introduced in the first paper of the series [6]. We start with a bird's eye view of our results. The exact definitions of some objects and concepts will be deferred to the next section, but this introduction can be read by taking them as black boxes. Furthermore Section 2 includes a summary of the first paper, so that the current paper is self-contained.
A trigraph is a graph with two disjoint edge sets: black edges (regular edges) and red edges (error edges). The graph induced by the red edges (resp. black edges) is called the red graph (resp. black graph). A $d$-trigraph has a red graph with maximum degree at most $d$. A contraction in a trigraph identifies two (not necessarily adjacent) vertices, and puts black edges towards shared neighbors in the black graph, and red edges towards the other (not necessarily shared) neighbors (see Figure 1). A $d$-contraction sequence, or $d$-sequence, of an $n$-vertex graph $G$ is a sequence of $d$-trigraphs $G = G_n, G_{n-1}, \ldots, G_2, G_1$ such that $G_i$ is obtained by performing a single contraction in $G_{i+1}$. In particular $G_1$ is the one-vertex graph $K_1$. The twin-width of $G$ is the minimum $d$ such that it admits a $d$-sequence.
A contraction sequence of $G$ may be seen as a path with $G$ at the left end and $K_1$ at the right end, where the current trigraph gets smaller and smaller as we walk from left to right. We show that this path can be made a tree of large arity. Now $G$ is at the root of the tree, all the leaves contain the graph $K_1$, and every child is obtained by performing a single contraction in the parent node. A $d$-contraction tree is such a tree with a $d$-trigraph at every node. More precisely, we show that if a graph $G$ has a $d$-contraction sequence, then it has a $D_d$-contraction tree with linear arity. By linear arity, we mean that every non-leaf node $H$ has $\Theta(|V(H)|)$ distinct children.
Denoting the class of graphs with twin-width at most $d$ by $\mathcal{C}_d$, the first consequence is that the number of graphs in $\mathcal{C}_d$ on the vertex set $[n]$ is at most $n!\,f(d)^n$. Intuitively the large-arity tree tells us that many $(n-1)$-vertex graphs of $\mathcal{C}_d$ can be obtained from the same $n$-vertex graph of $\mathcal{C}_d$. By inverting the process, there are not so many distinct $n$-vertex graphs in $\mathcal{C}_d$, obtained by splitting a vertex in $(n-1)$-vertex graphs of $\mathcal{C}_d$. This crucial fact makes the inductive proof work. Our result generalizes several similar theorems in enumerative combinatorics.
The first one is an over 50-year-old result that bounded treewidth graphs on vertex set $[n]$ have a similar growth in $n!\,c^n$ [2]. Graph classes with such a growth are called small. The second one is comparatively much more recent: it is the celebrated answer to the Stanley-Wilf conjecture, now the Marcus-Tardos theorem. Marcus and Tardos [20] showed that there are at most $c_\sigma^n$ permutations over $[n]$ avoiding a fixed permutation pattern $\sigma$. In other words, every proper subclass of permutations (where a class of permutations is closed under taking subpermutations) has at most single-exponential growth, much below $n!$, the growth of the full class. Expressed in the language of graph classes, proper subclasses of permutation graphs are small. The third one, due to Norine et al. [24], is that the number of graphs on vertex set $[n]$ not containing a fixed minor $H$ is at most $n!\,c_H^n$. Thus proper minor-closed classes are small. We previously showed [6] that bounded treewidth (even rank-width) graphs, proper subclasses of permutation graphs, and proper minor-closed classes have bounded twin-width. Thus the fact that bounded twin-width classes are small unifies and extends all the above-mentioned theorems.
We then explore the converse statement. Could it be that every small hereditary class has bounded twin-width? We do not answer this question, dubbed the small conjecture, but instead we give some evidence that it may be true. This comes in the form of showing that many potential counterexamples, that is, seemingly complex small hereditary classes, actually have bounded twin-width. If the conjecture is true, it gives a universal explanation for the single-exponential growth (up to isomorphism) of combinatorial classes: translate the objects into graphs or matrices; a bound, or lack thereof, on the twin-width of the class decides the existence of such a bound on the growth.
Another by-product of the contraction tree is that we can always contract in parallel a linear number of disjoint pairs of vertices. This gives rise to so-called parallel $d$-sequences of logarithmic length. This will be instrumental in showing that bounded twin-width classes admit an $O(\log n)$-adjacency labeling scheme. This verifies a variety of particular cases of the implicit graph conjecture, which posits that such labeling schemes exist for every factorial hereditary class, i.e., hereditary class with growth $n!^{O(1)}$.
Finally we show that five different ways of restricting twin-width to sparse classes actually lead to the same notion. For example, bounded sparse twin-width classes can be equivalently defined as hereditary classes with bounded twin-width that are $K_{t,t}$-free or where every graph has at most linearly many edges. A first but challenging step towards the small conjecture is to show that small sparse classes have bounded (sparse) twin-width. For instance, do classes with polynomial expansion have bounded twin-width? We discuss (possible) containments and strict containments of established sparse classes with respect to bounded sparse twin-width.

Preliminaries and outline
In this section we recall the relevant notations and definitions, summarize the important bits of the first paper, and outline our new results.

Notations and definitions
We denote by [i, j] the set of integers {i, i + 1, . . ., j − 1, j}, and by [i] the set of integers [1, i].
If $X$ is a set of sets, we denote by $\cup X$ their union. Unless stated otherwise, all graphs are assumed undirected and simple, that is, they do not have parallel edges or self-loops. We denote by $V(G)$ and $E(G)$ the set of vertices and edges, respectively, of a graph $G$.
The distance $d_G(u, v)$ between two vertices $u$ and $v$ of $G$ is the number of edges in a shortest path from $u$ to $v$, and $\infty$ if $u$ and $v$ are in two distinct connected components of $G$. In all the notations with a graph subscript, we may omit it if the graph is clear from the context. A graph class is a family of graphs closed under isomorphism (i.e., under renaming the vertices). Since we will be interested in the "size" of a class, we will further impose that the vertex set of $n$-vertex graphs is precisely $[n]$. With that requirement the number of $n$-vertex graphs in a class $\mathcal{C}$ is a well-defined (finite) number. Observe that a single $n$-vertex graph in a class $\mathcal{C}$ brings all its relabelings into $\mathcal{C}$, of which there can be up to $n!$. A graph class is said to be hereditary if it is closed under taking induced subgraphs. It is said to be monotone or subgraph-closed if it is even closed under taking subgraphs.
A graph is $H$-free if it does not contain $H$ as an induced subgraph. However we make an exception for $H = K_{t,t}$. A $K_{t,t}$-free graph is a graph with no biclique $K_{t,t}$ as a subgraph. A class is $H$-free if all its graphs are $H$-free. When $t$ is not yet defined, we may say that a class $\mathcal{C}$ is $K_t$-free (resp. $K_{t,t}$-free) to mean that there exists a finite integer $t$ such that $\mathcal{C}$ is $K_t$-free (resp. $K_{t,t}$-free).
We denote by $\Delta(G)$ the maximum degree of a vertex in $G$. An edge contraction of two adjacent vertices $u, v$ consists of merging $u$ and $v$ into a single vertex adjacent to $N(\{u, v\})$ (and deleting $u$ and $v$). A graph $H$ is a minor of a graph $G$ if $H$ can be obtained from $G$ by a sequence of vertex and edge deletions, and edge contractions. Equivalently a minor $H$ with vertex set, say, $\{v_1, \ldots, v_h\}$ can be defined by pairwise disjoint sets $B_1, \ldots, B_h \subseteq V(G)$, each inducing a connected subgraph of $G$, such that there is an edge of $G$ between $B_i$ and $B_j$ whenever $v_i v_j \in E(H)$. Indeed after contracting each $B_i$ into a single vertex (which is possible since they induce connected subgraphs), $H$ appears as a subgraph. The set $B_i$ is called the branch set of $v_i \in V(H)$. A graph $G$ is said to be $H$-minor free if $H$ is not a minor of $G$. A class is said to be minor-closed if every minor of a member of the class is in the class, and proper minor-closed if further the class is not the set of all graphs.
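The radius $\mathrm{rad}(G)$ of a graph $G$ is defined as $\min_{u \in V(G)} \max_{v \in V(G)} d_G(u, v)$. The radius $\mathrm{rad}_G(S)$ of a subset of vertices $S \subseteq V(G)$ is simply defined as $\mathrm{rad}(G[S])$. Note that two vertices can be further away in $G[S]$ than in $G$. A graph $H$ is an $r$-shallow minor of $G$ if $H$ is a minor of $G$ witnessed by branch sets of radius at most $r$. We denote that by $H \preccurlyeq_r G$. In particular 0-shallow minors correspond to subgraphs. The theory of graph sparsity pioneered by Ossona de Mendez and Nešetřil [23] introduces the following invariants for a graph $G$ and a class $\mathcal{C}$:
$\nabla_r(G) = \sup_{H \preccurlyeq_r G} \frac{|E(H)|}{|V(H)|}$ and $\nabla_r(\mathcal{C}) = \sup_{G \in \mathcal{C}} \nabla_r(G)$.
A class $\mathcal{C}$ has expansion $f$ if $\nabla_r(\mathcal{C}) \leq f(r)$ for every $r$, and bounded expansion if it has expansion $f$ for some function $f$. A class has polynomial expansion if it has expansion $f$ for a polynomial function $f$. Proper minor-closed classes even have constant expansion, i.e., expansion $f$ for a constant function $f$.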

Summary of the previous paper
In the previous paper of the series [6], we introduced a new graph and matrix invariant dubbed twin-width, inspired by the work of Guillemot and Marx on permutations [17]. We proved that many classes, such as bounded rank-width graphs, proper minor-free classes, proper subclasses of permutation graphs, and posets with antichains of bounded size, have bounded twin-width. For all these classes, we showed how to find in polynomial time a so-called $d$-sequence, witnessing that the twin-width is at most a constant $d$. Finally, given a $d$-sequence of a binary structure $G$ on $n$ elements and a first-order (FO) formula $\varphi$ of quantifier-depth $\ell$, we provided an FO model checking algorithm deciding $G \models \varphi$ in time $f(d, \ell)\,n$. We start by recalling the definition of twin-width, and then we summarize the milestones of [6] that will also be useful in the current paper.

Trigraphs, contraction sequences, and twin-width of a graph
A trigraph $G$ has vertex set $V(G)$, (black) edge set $E(G)$, and red edge set $R(G)$ (the error edges), with $E(G)$ and $R(G)$ being disjoint. The set of neighbors $N_G(v)$ of a vertex $v$ in a trigraph $G$ consists of all the vertices adjacent to $v$ by a black or red edge. A $d$-trigraph is a trigraph $G$ such that the red graph $(V(G), R(G))$ has degree at most $d$. In that case, we also say that the trigraph has red degree at most $d$. In the context of trigraphs and twin-width, we will somewhat overload the term "contraction". A contraction or identification in a trigraph $G$ consists of merging two (not necessarily adjacent) vertices $u$ and $v$ into a single vertex $w$, and updating the edges of $G$ in the following way. Every vertex of the symmetric difference $N_G(u) \triangle N_G(v)$ is linked to $w$ by a red edge. Every vertex $x$ of the intersection $N_G(u) \cap N_G(v)$ is linked to $w$ by a black edge if both $ux \in E(G)$ and $vx \in E(G)$, and by a red edge otherwise. The rest of the edges (not incident to $u$ or $v$) remain unchanged. We insist that the vertices $u$ and $v$ (together with the edges incident to these vertices) are removed from the trigraph. See Figure 1 for an illustration.
A $d$-sequence of an $n$-vertex graph $G$ is a sequence of $d$-trigraphs $G_n, G_{n-1}, \ldots, G_1$, where $G_n = G$, $G_1 = K_1$ is the graph on a single vertex, and $G_{i-1}$ is obtained from $G_i$ by performing a single contraction of two (not necessarily adjacent) vertices. We observe that $G_i$ has precisely $i$ vertices, for every $i \in [n]$. The twin-width of $G$, denoted by $\mathrm{tww}(G)$, is the minimum integer $d$ such that $G$ admits a $d$-sequence. Going back to the overload of the word "contraction", in case we actually refer to the classical (edge) contraction, either we will use the term "edge contraction", or it will be clear from the context what is meant.
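The contraction rule and the width of a contraction sequence can be made concrete with a short sketch. The class name `Trigraph`, the function `sequence_width`, and the convention that the merged vertex keeps the name of the first contracted vertex are illustrative assumptions, not notation from the paper.

```python
class Trigraph:
    """Trigraph stored as two adjacency maps, one for black and one for red edges."""

    def __init__(self, vertices, black_edges):
        self.black = {v: set() for v in vertices}
        self.red = {v: set() for v in vertices}
        for u, v in black_edges:
            self.black[u].add(v)
            self.black[v].add(u)

    def neighbors(self, v):
        # Neighbors by a black or a red edge.
        return self.black[v] | self.red[v]

    def contract(self, u, v):
        """Merge u and v; the merged vertex keeps the name u (an arbitrary choice)."""
        others = (self.neighbors(u) | self.neighbors(v)) - {u, v}
        # x stays black only if it is a black neighbor of both u and v.
        new_black = {x for x in others if x in self.black[u] and x in self.black[v]}
        new_red = others - new_black
        # Remove u and v together with all their incident edges.
        for z in (u, v):
            del self.black[z], self.red[z]
        for nbrs in list(self.black.values()) + list(self.red.values()):
            nbrs -= {u, v}
        # Insert the merged vertex under the name u.
        self.black[u], self.red[u] = new_black, new_red
        for x in new_black:
            self.black[x].add(u)
        for x in new_red:
            self.red[x].add(u)

    def max_red_degree(self):
        return max((len(nbrs) for nbrs in self.red.values()), default=0)


def sequence_width(vertices, edges, sequence):
    """Maximum red degree seen along a contraction sequence given as (u, v) pairs."""
    g = Trigraph(vertices, edges)
    width = g.max_red_degree()
    for u, v in sequence:
        g.contract(u, v)
        width = max(width, g.max_red_degree())
    return width
```

On the path with vertices 1, 2, 3, contracting the twins 1 and 3 first never creates a red edge, whereas contracting 1 and 2 first creates one; `sequence_width` reports the width of a given sequence only and does not search for an optimal one.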

Partitions, divisions, red number, and twin-width of a matrix
We now give two equivalent definitions for the twin-width of a matrix. The first is based on a contraction sequence where we progressively reduce the size of the matrix, and introduce error symbols $r$. The second (equivalent) definition is based on a coarsening sequence where we progressively coarsen a partition of the rows and columns of the matrix.
The red number of a matrix is the maximum number of $r$ entries (error entries, the $r$ stands for red) in a single row or column. Given an $n \times m$ matrix $M$ and two columns $C_i$ and $C_j$ (resp. two rows $R_i$ and $R_j$), the contraction of $C_i$ and $C_j$ (resp. $R_i$ and $R_j$) is obtained by deleting $C_j$ (resp. $R_j$) and replacing every entry $m_{k,i}$ of $C_i$ (resp. every entry $m_{i,k}$ of $R_i$) by $r$ whenever $m_{k,i} \neq m_{k,j}$ (resp. $m_{i,k} \neq m_{j,k}$). A $d$-contraction sequence of a matrix $M$ is a sequence of successive contractions starting at $M$, ending at some $1 \times 1$ matrix, such that all matrices of the sequence have red number at most $d$. The twin-width of a matrix $M$ is the smallest integer $d$ such that $M$ admits a $d$-contraction sequence.
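As a minimal sketch of the matrix contraction just described, assuming matrices are lists of rows and the error symbol is the string `"r"` (the function names are illustrative choices):

```python
R = "r"  # error symbol


def contract_columns(M, i, j):
    """Contract columns i and j: column i is kept, column j is deleted,
    and an entry of column i becomes r wherever the two columns disagree."""
    out = []
    for row in M:
        new_row = list(row)
        if row[i] != row[j]:
            new_row[i] = R
        del new_row[j]
        out.append(new_row)
    return out


def transpose(M):
    return [list(col) for col in zip(*M)]


def contract_rows(M, i, j):
    """Row contraction, obtained from the column case by transposition."""
    return transpose(contract_columns(transpose(M), i, j))


def red_number(M):
    """Maximum number of r entries in a single row or column."""
    rows = [row.count(R) for row in M]
    cols = [col.count(R) for col in zip(*M)]
    return max(rows + cols, default=0)
```

For instance, contracting the last two columns of the matrix with rows `[0, 1, 1]` and `[1, 1, 0]` leaves the first row error-free and puts a single `r` in the second row.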
We observe that when $M$ has twin-width at most $d$, one can reorder its rows and columns such that every contraction is on two consecutive rows or two consecutive columns. The reordered matrix is then called $d$-twin-ordered. The symmetric twin-width of an $n \times n$ matrix $M$ is defined similarly, except that the contraction of rows $i$ and $j$ (resp. columns $i$ and $j$) is immediately followed by the contraction of columns $i$ and $j$ (resp. rows $i$ and $j$). The symmetric twin-width of the adjacency matrix of a graph $G$ corresponds to the twin-width of $G$.
For the second definition of the twin-width of a matrix, we need to introduce a bit of vocabulary on partitions. We say that a partition $\mathcal{P}$ of a set $S$ refines a partition $\mathcal{P}'$ of $S$ if every part of $\mathcal{P}$ is contained in a part of $\mathcal{P}'$. Conversely we say that $\mathcal{P}'$ is a coarsening of $\mathcal{P}$. We will further assume that a coarsening is proper, that is, $\mathcal{P}$ and $\mathcal{P}'$ are distinct. Given a partition $\mathcal{P}$ and two distinct parts $P, P'$ of $\mathcal{P}$, the elementary coarsening of $P$ and $P'$ yields the coarsening $(\mathcal{P} \setminus \{P, P'\}) \cup \{P \cup P'\}$. Informally an elementary coarsening is the merge of two parts.
Given an $n \times m$ matrix $M$, we call row-partition (resp. column-partition) a partition of the rows (resp. columns) of $M$. A $(k, \ell)$-partition, or simply partition, of a matrix $M$ is a pair $(\mathcal{R}, \mathcal{C})$ where $\mathcal{R}$ is a row-partition with $k$ parts and $\mathcal{C}$ is a column-partition with $\ell$ parts. In a matrix partition $(\mathcal{R}, \mathcal{C})$, each part $R \in \mathcal{R}$ is called a row-part, and each part $C \in \mathcal{C}$ is called a column-part. An elementary coarsening of a partition $(\mathcal{R}, \mathcal{C})$ of a matrix $M$ is obtained by performing one elementary coarsening in $\mathcal{R}$ or in $\mathcal{C}$. We distinguish two canonical partitions of an $n \times m$ matrix $M$: the finest partition where $\mathcal{R}$ and $\mathcal{C}$ have size $n$ and $m$, respectively, and the coarsest partition where $|\mathcal{R}| = |\mathcal{C}| = 1$.
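A coarsening sequence of an $n \times m$ matrix $M$ is a sequence of partitions $(\mathcal{R}_1, \mathcal{C}_1), \ldots, (\mathcal{R}_{n+m-1}, \mathcal{C}_{n+m-1})$ where $(\mathcal{R}_1, \mathcal{C}_1)$ is the finest partition, $(\mathcal{R}_{n+m-1}, \mathcal{C}_{n+m-1})$ is the coarsest partition, and for every $i \in [n+m-2]$, $(\mathcal{R}_{i+1}, \mathcal{C}_{i+1})$ is an elementary coarsening of $(\mathcal{R}_i, \mathcal{C}_i)$.
Given a subset $R$ of rows and a subset $C$ of columns in a matrix $M$, the zone $R \cap C$ denotes the submatrix of all entries of $M$ at the intersection between a row of $R$ and a column of $C$. A zone of a matrix partitioned by $(\mathcal{R}, \mathcal{C})$ is any $R_i \cap C_j$ with $R_i \in \mathcal{R}$ and $C_j \in \mathcal{C}$. A zone is constant if all its entries are identical. The error value of a row-part $R_i$ (resp. a column-part $C_j$) is the number of non-constant zones among all zones $R_i \cap C$ with $C \in \mathcal{C}$ (resp. $R \cap C_j$ with $R \in \mathcal{R}$). The error value of $(\mathcal{R}, \mathcal{C})$ is the maximum error value of a part, taken over all parts $R_i$ and $C_j$. Now the twin-width of a matrix $M$ can be equivalently defined as the minimum $d$ for which $M$ admits a coarsening sequence in which all partitions have error value at most $d$.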
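The error value of a matrix partition can be computed directly from the definitions. The sketch below assumes that parts are given as lists of row or column indices; the helper names are illustrative.

```python
def zone(M, rows, cols):
    """Submatrix at the intersection of a row-part and a column-part."""
    return [[M[i][j] for j in cols] for i in rows]


def is_constant(Z):
    """A zone is constant if all its entries are identical."""
    return len({x for row in Z for x in row}) <= 1


def error_value(M, row_parts, col_parts):
    """Maximum, over all parts, of the number of non-constant zones in that part."""
    row_err = [sum(not is_constant(zone(M, R, C)) for C in col_parts) for R in row_parts]
    col_err = [sum(not is_constant(zone(M, R, C)) for R in row_parts) for C in col_parts]
    return max(row_err + col_err)


def elementary_coarsening(parts, a, b):
    """Merge the parts at positions a and b of a partition."""
    merged = sorted(parts[a] + parts[b])
    return [p for k, p in enumerate(parts) if k not in (a, b)] + [merged]
```

Starting from the finest partition and repeatedly calling `elementary_coarsening` on the row-partition or on the column-partition yields a coarsening sequence, whose largest `error_value` is the quantity minimized in the second definition of twin-width.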
We will work with particular partitions, called divisions, where every part consists of a set of consecutive rows, or a set of consecutive columns. If the matrix is $d$-twin-ordered, there is a coarsening sequence with error value at most $d$, in which all the partitions are divisions. We call such a coarsening sequence a division sequence.

Grid minor theorem for twin-width
A $(t, t)$-division is a division $(\mathcal{R}, \mathcal{C})$ such that $|\mathcal{R}| = |\mathcal{C}| = t$. A $t$-grid minor is a $(t, t)$-division whose $t^2$ zones each contain a non-zero entry. As for the breakthrough algorithm of Guillemot and Marx [17] for Permutation Pattern, a crucial engine of twin-width is the following celebrated theorem by Marcus and Tardos.
Theorem 1 ([20]). For every integer $t$, there is some $c_t$ such that every $n \times m$ $0,1$-matrix $M$ with at least $c_t \max(n, m)$ entries 1 has a $t$-grid minor.
Informally, if a matrix has sufficiently many entries 1, then there is a large grid structure where each cell is "complicated". The current best bound for $c_t$, due to Cibulka and Kynčl [8], is $\frac{8}{3}(t+1)^2 2^{4t}$.
To leverage the Marcus-Tardos theorem in the dense regime as well, we modify the definition of "complicated" from "containing a 1" to "being mixed". A zone is horizontal if all its columns are equal (restricted to the zone), and vertical if all the rows are equal. Equivalently each row (resp. column) within a horizontal zone (resp. vertical zone) consists of a single repeated entry. Note that a zone is constant (consists of a single repeated entry) if it is horizontal and vertical. A zone is mixed if it is neither horizontal nor vertical.
We can now introduce the notions of $t$-mixed minors and $t$-mixed freeness. A $t$-mixed minor of a matrix $M$ is a $(t, t)$-division of $M$ such that every zone is mixed. A matrix is $t$-mixed free if it does not admit a $t$-mixed minor. We showed that having small twin-width and admitting no large mixed minors are equivalent in the following sense.
The first item is a relatively simple observation. The difficulty lies in the second item. In a nutshell, if the matrix is $t$-mixed free, we find, using the Marcus-Tardos theorem, a sequence of divisions with a small number of mixed zones per column and per row. From this favorable sequence of divisions, we are able to extract an $f(t)$-contraction sequence.
One simple but important ingredient is a local characterization of mixedness by means of a corner. A corner in a matrix $M = (m_{i,j})_{i,j}$ is a mixed zone made by four contiguous entries $m_{i,j}, m_{i+1,j}, m_{i,j+1}, m_{i+1,j+1}$. A $0,1$-corner is a corner where each entry is in $\{0, 1\}$.

Lemma 3 ([6]). A matrix is mixed if and only if it contains a corner.
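The notions of horizontal, vertical, and mixed zones, and the statement of Lemma 3, can be checked mechanically on small matrices. The following sketch assumes zones are non-empty lists of rows; the function names, including the brute-force checker, are illustrative.

```python
from itertools import product


def is_horizontal(Z):
    """All columns are equal, i.e., every row repeats a single entry."""
    return all(len(set(row)) == 1 for row in Z)


def is_vertical(Z):
    """All rows are equal, i.e., every column repeats a single entry."""
    return all(row == Z[0] for row in Z)


def is_mixed(Z):
    return not is_horizontal(Z) and not is_vertical(Z)


def has_corner(Z):
    """Look for a mixed zone made by four contiguous entries."""
    for i in range(len(Z) - 1):
        for j in range(len(Z[0]) - 1):
            block = [Z[i][j:j + 2], Z[i + 1][j:j + 2]]
            if is_mixed(block):
                return True
    return False


def lemma3_holds_for_all(n_rows, n_cols):
    """Brute-force comparison of is_mixed and has_corner on all 0,1-matrices of a given size."""
    for entries in product((0, 1), repeat=n_rows * n_cols):
        Z = [list(entries[r * n_cols:(r + 1) * n_cols]) for r in range(n_rows)]
        if is_mixed(Z) != has_corner(Z):
            return False
    return True
```

The exhaustive check is only practical for very small dimensions; it is meant as a sanity check of the definitions rather than as a substitute for the proof in [6].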
In Section 3 we will work with specifically divided $0,1,r$-matrices, respecting the following invariants. Every zone is filled with $r$ entries, or is non-mixed (that is, horizontal or vertical) and has only 0 and 1 entries. In this context, we will redefine the mixed zones as those filled with $r$ entries. The coarsenings will be followed by updating the entries of the matrix to keep the invariants. Namely every zone with a $0,1$-corner is filled with $r$ entries. This new viewpoint mixes contraction sequence and coarsening sequence. It will turn out useful to find, in a $t$-mixed free matrix, not just one "good contraction" (as in Theorem 2) but a linear number of disjoint pairs of "good contractions". This will have two main consequences. It will enable us to show that bounded twin-width classes are small (see Section 2.3 for a formal definition). This will also be used to find $O(\log n)$-bits adjacency labeling schemes (see Section 2.4) for $n$-vertex graphs in classes of bounded twin-width.

Closure by FO transduction
Bounded twin-width behaves surprisingly well with respect to first-order logic. In addition to the fixed-parameter tractable algorithm running in time $f(d, |\varphi|)\,n$ for model checking a first-order sentence $\varphi$ on an $n$-vertex graph given with a $d$-contraction sequence, we show that bounded twin-width is preserved by first-order (FO) transductions.

Theorem 4 ([6]). Every transduction of a bounded twin-width class has bounded twin-width.
A formal definition of FO transductions can be found in several papers (see for instance [4,6]). As this definition is somewhat lengthy and technical and we will only use Theorem 4 in a black-box fashion, we refer the interested reader to these papers. Informally an FO transduction of a graph $G$ defines several new graphs. It consists of a non-deterministic "coloring" of $V(G)$ by a constant number of unary relations, followed by a redefinition of the edges by means of a fixed FO formula using the former edge predicate as well as these new unary relations. The unary relations are then discarded, and we here further allow taking any induced subgraph of the obtained graph (to preserve the class heredity). An FO transduction of a class $\mathcal{C}$ is simply the union of the graphs obtained by FO transduction of $G$, for every $G \in \mathcal{C}$.

Small classes and the small conjecture
We recall that a hereditary class is a class closed under taking induced subgraphs. Formally if $G$ is in a hereditary class $\mathcal{C}$, then for every induced subgraph $H$ of $G$, it also holds that $H$ is in $\mathcal{C}$. The overwhelming majority of the usually considered classes of graphs are hereditary.
A class of graphs $\mathcal{C}$ is said to be small (resp. factorial) if there exists a constant $c$ such that the number of $n$-vertex graphs of $\mathcal{C}$ is at most $n!\,c^n$ (resp. $n!^c = 2^{O(n \log n)}$), for every $n \in \mathbb{N}$. Recall that our $n$-vertex graphs are all assumed to be on the vertex set $[n]$, and that we count up to equality and not up to isomorphism. Norine et al. [24] show that the number of $K_t$-minor free graphs on $[n]$ is at most $n!\,c^n$, for some integer $c$ depending only on $t$. In other words, proper minor-closed classes are small. The Marcus-Tardos theorem [20], combined with an argument due to Klazar [19], implies that the number of $n \times n$ $0,1$-matrices avoiding a fixed permutation submatrix is at most $c^n$, for some constant $c$. In particular the number of permutations on $n$ elements avoiding a fixed permutation grows in $2^{O(n)}$. A translation of this result to graphs is that proper subclasses of permutation graphs are small.
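To get a feeling for the bound $n!\,c^n$, one can compare it with two exact counts. The sketch below uses Cayley's formula $n^{n-2}$ for the number of trees on vertex set $[n]$ (a classical fact, not taken from this paper) and the count $2^{\binom{n}{2}}$ of all graphs on $[n]$; the function names and the choice $c = 3$ are illustrative assumptions.

```python
from math import comb, factorial


def labeled_trees(n):
    """Number of trees on vertex set [n], by Cayley's formula."""
    return 1 if n <= 2 else n ** (n - 2)


def all_labeled_graphs(n):
    """Number of graphs on vertex set [n]."""
    return 2 ** comb(n, 2)


def fits_small_bound(count, n, c):
    """Check the inequality count <= n! * c^n with exact integer arithmetic."""
    return count <= factorial(n) * c ** n
```

With $c = 3$ the tree counts stay below $n!\,c^n$ for every $n$ one cares to test, which is consistent with trees forming a small class, while the count of all graphs already exceeds the bound at $n = 20$ and, since it grows like $2^{\Theta(n^2)}$, eventually exceeds it for any fixed $c$.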
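We say that a class $\mathcal{C}$ has bounded twin-width if there exists an integer $d_{\mathcal{C}}$ such that every member of $\mathcal{C}$ has twin-width at most $d_{\mathcal{C}}$. Thus $\mathrm{tww}(\mathcal{C}) := \sup_{G \in \mathcal{C}} \mathrm{tww}(G) < \infty$.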
One of the main contributions of the paper is the following.

Theorem 5. Every class with bounded twin-width is small.
This generalizes the smallness of proper minor-closed classes [24], proper subclasses of permutation graphs [20,19], and graphs with bounded treewidth [2], as we previously showed that all these classes have bounded twin-width [6]. We then explore a possible converse for Theorem 5. Of course it is easy to artificially build an unbounded twin-width class with only $n!$ graphs of size $n$, for example, by taking in the class a single (up to isomorphism) $n$-vertex graph among the $n$-vertex graphs with maximum twin-width, for every $n$. However this is not a satisfactory counterexample. In combinatorics, classes of objects are often required to be closed under substructures. For instance, a class of permutations is by definition closed under taking subpermutations. The same goes for graphs: hereditary classes have richer properties than non-hereditary ones. Many interesting questions on hereditary classes have trivial answers or are not even well-defined on general classes.
We provocatively conjecture the following converse of Theorem 5.

Conjecture 6 (small conjecture). Every small hereditary class has bounded twin-width.
It may seem ambitious to expect that the converse of Theorem 5 holds for hereditary classes. Why would the mere limited number of graphs guarantee anything close to a $d$-contraction sequence? A typical example of a class with unbounded twin-width contains an infinite sequence of graphs $G_1, G_2, \ldots$ where every pair of distinct vertices $u, v \in V(G_i)$ satisfies $|N(u) \triangle N(v)| \geq i$. Indeed any first contraction in $G_i$ creates a vertex with red degree at least $i$. A class is said to have unbounded symmetric difference if it contains such a sequence, and bounded symmetric difference otherwise. So for every class $\mathcal{C}$ with bounded symmetric difference, there is an integer $d$ such that for every graph $G \in \mathcal{C}$, there exist two distinct vertices $u, v \in V(G)$ satisfying $|N(u) \triangle N(v)| \leq d$. For example, the $i \times i$ rook graphs (with vertex set $[i] \times [i]$ and an edge between $(a, b)$ and $(c, d)$ if $a = c$ or $b = d$), with $i \geq 3$, form a class with unbounded symmetric difference. However the hereditary closure of this class is not small.
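The rook graph example can be explored numerically. The sketch below builds the $i \times i$ rook graph and computes the smallest symmetric difference of two neighborhoods; it makes the assumption that $u$ and $v$ themselves are not counted in the symmetric difference, since they disappear in the contraction. Function names are illustrative.

```python
from itertools import combinations


def rook_graph(i):
    """Adjacency sets of the i x i rook graph on vertex set {0..i-1} x {0..i-1}."""
    verts = [(a, b) for a in range(i) for b in range(i)]
    adj = {v: set() for v in verts}
    for u, v in combinations(verts, 2):
        if u[0] == v[0] or u[1] == v[1]:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def min_symmetric_difference(adj):
    """Smallest size of N(u) symmetric-difference N(v), ignoring u and v, over distinct u, v."""
    return min(len((adj[u] ^ adj[v]) - {u, v}) for u, v in combinations(adj, 2))
```

Under this convention the value returned for the rook graphs grows with $i$, in line with the claim that the class has unbounded symmetric difference.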
Having bounded symmetric difference is a prerequisite to having bounded twin-width. A first step towards Conjecture 6 would be to show that small hereditary classes have bounded symmetric difference. Even that is unclear. For $K_{2,2}$-free classes or classes with girth at least 5, bounded symmetric difference simply implies bounded minimum degree. Thus a very particular case of Conjecture 6 is that every small $K_{2,2}$-free hereditary class has bounded minimum degree.
Let us present some elements supporting the conjecture. First and foremost, bounded twin-width seems to "stop at the right place" in the sparse and dense realms. Unit interval graphs (a small class) have bounded twin-width while interval graphs (a non-small class) do not. Similarly among sparse classes, proper minor-closed classes (small) have bounded twin-width, whereas subcubic graphs (non-small) have unbounded twin-width. We will also see that some expander classes have bounded twin-width (and are small), unlike random cubic graphs.
An interesting test is the case of $s$-subdivisions (where each edge of a graph is subdivided $s \geq 1$ times). Since the number of subcubic graphs on $[n]$ is $n^{3n/2 + O(n/\log n)}$, the $o(\log n)$-subdivisions of subcubic graphs still form a non-small class. Thus by Theorem 5, they have unbounded twin-width. We show a more fine-grained version of that fact by a direct proof. We also build in polynomial time $O(1)$-sequences for $\Omega(\log n)$-subdivisions of $K_n$, which yields the following.

Theorem 7. The $s$-subdivision of $K_n$ has bounded twin-width if and only if $s = \Omega(\log n)$.
More precisely, for every integer $d$, there are $\ell_d < u_d$ such that the $(c \log n)$-subdivision of $K_n$ has twin-width at least $d$ for every $1 \leq c \leq \ell_d$, and at most $d$ for every $c \geq u_d$. The hereditary closure of $\Omega(\log n)$-subdivisions of $K_n$ is indeed a small class. But Theorem 7 in particular implies that this class does have bounded twin-width. Dvořák and Norine [15] show that, for any constants $c, \varepsilon > 0$, classes with expansion $r \mapsto c^{r^{1/3 - \varepsilon}}$ are small, while the class of all graphs with expansion $r \mapsto 6 \cdot 3^{\sqrt{r \log(r + e)}}$ is not small. If the small conjecture is true, then bounded twin-width contains polynomial expansion (actually even expansion $r \mapsto 2^{r^{0.33}}$). Thus another possible first step to Conjecture 6 is to show that classes with polynomial expansion have bounded twin-width.
A supplementary motivation for the small conjecture appears if its proof is algorithmic, that is, yields on any small hereditary class a polytime algorithm which takes any graph of the class and outputs a (not necessarily optimal) $O(1)$-sequence. In light of Theorem 5 and considering that $\omega(1)$-sequences are not as algorithmically useful, that would be almost as good as a constant approximation of twin-width in general graphs.

Implicit representations
A class $\mathcal{C}$ has an $f(n)$-bits adjacency labeling scheme (or simply labeling scheme, for short) if there is a decoding function $A : \{0, 1\}^* \times \{0, 1\}^* \to \{0, 1\}$ such that for every $n$-vertex graph $G \in \mathcal{C}$ there is a labeling function $\ell : V(G) \to \{0, 1\}^{f(n)}$ satisfying $A(\ell(u), \ell(v)) = 1$ if and only if $uv \in E(G)$. Here we will further impose that the labeling function is injective. For example trees are now known to have a $\log n + O(1)$-bits adjacency labeling scheme [1], which, up to the constant term, is optimal. It is known that a class $\mathcal{C}$ has a $c \log n$-bits adjacency labeling scheme if and only if, for every integer $n$, there is a universal graph $U_n$ (not necessarily in $\mathcal{C}$) on at most $n^c$ vertices such that every $n$-vertex graph of $\mathcal{C}$ is an induced subgraph of $U_n$ (see for instance [27]). This becomes apparent when one considers the possible labels as the vertex set of the universal graph.
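To make the definition concrete, here is a minimal sketch of a folklore labeling scheme for rooted trees using roughly $2 \log n$ bits per vertex: each label stores the identifier of the vertex followed by the identifier of its parent. This is not the $\log n + O(1)$ scheme of [1], nor the scheme built in this paper; the input format (a `parent` list where the root is its own parent) and the function names are assumptions made for the illustration.

```python
def label_tree(parent):
    """Labels for a rooted tree on vertices 0..n-1, where parent[v] is the parent
    of v and the root is its own parent. Each label is the binary identifier of
    the vertex followed by the binary identifier of its parent."""
    n = len(parent)
    w = max(1, (n - 1).bit_length())  # bits needed for one identifier
    return [format(v, f"0{w}b") + format(parent[v], f"0{w}b") for v in range(n)]


def adjacent(label_u, label_v):
    """Decoding function: it only looks at the two labels, never at the tree."""
    w = len(label_u) // 2
    id_u, par_u = label_u[:w], label_u[w:]
    id_v, par_v = label_v[:w], label_v[w:]
    return id_u != id_v and (par_u == id_v or par_v == id_u)
```

The labeling is injective because every label starts with the identifier of its own vertex, and the decoder answers an adjacency query by comparing bit strings only, as required of the function $A$.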
Several classes, such as interval graphs and $K_t$-minor free graphs, are known to have $O(\log n)$-bits labeling schemes. By a direct counting argument, only factorial classes can expect to admit an $O(\log n)$-bits labeling scheme. Indeed the number of distinct labels is $2^{O(\log n)} = n^{O(1)}$. Thus the number of $n$-vertex graphs that can be induced subgraphs of the universal graph is only $(n^{O(1)})^n = n^{O(n)}$. The implicit graph conjecture asserts that every factorial hereditary class has an $O(\log n)$-bits labeling scheme [18]. We show the conjecture in the particular case of bounded twin-width classes.

Theorem 8. Every bounded twin-width class admits an O(log n)-bits labeling scheme.
This result is at the same time quite strong and quite weak. Its strength lies in its broad generality. We produce a unified labeling scheme for very different sparse and dense classes. However there are two caveats, both linked to its generality. The first one is that we still do not know if the labeling function can be computed in polynomial time. Indeed it requires a $d$-sequence (even a so-called parallel $D$-sequence of logarithmic length). While we know how to compute this sequence in many bounded twin-width classes, we do not know how to do so in the full generality of all the graphs with twin-width at most $d$. In the latter case, we currently need exponential time to find the sequence, and then to compute the labeling. The decoding function, that is the adjacency test, runs in time $O(\log n)$ in the RAM model with unit-cost arithmetic operations over words of logarithmic length. The second caveat is that when restricted to particular classes, the multiplicative constant preceding $\log n$ given by our proof is much larger than in the shortest known labeling schemes. For instance, the current best labeling scheme for $K_t$-minor free graphs requires $2 \log n + o(\log n)$ bits per vertex [16], while our multiplicative constant is double-exponential in $t$.
Improving the constant c of existing (c + o( 1)) log n-bits labeling schemes is topical in implicit representations.Recently planar graphs were shown to admit a (1 + o( 1)) log n-bits adjacency labeling scheme [9].It is optimal up to the second-order term.The labeling scheme is actually more general, and works for all subgraphs of strong products H P where H is a bounded-treewidth chordal graph (or k-tree, for some fixed k), and P is a path.A class C is said flat if there is an integer k such that C ⊆ Sub(H P) where P is the set of all paths, and H is a set of graphs with treewidth at most k.An ongoing program (not specific to adjacency labeling schemes), dubbed graph product structure theorem, established that many small and sparse classes are flat.This was initiated by a paper by Pilipczuk and Siebertz [25] showing a similar result for planar graphs.This property was extended to apex-minor free [11], bounded-degree minor-free [10], and k-planar classes [12].Hence they all enjoy a (1 + o( 1)) log n-bits adjacency labeling scheme.Interestingly all these classes have bounded twin-width (minor-free classes and k-planar graphs have bounded twin-width [6]).This is no coincidence.We will see that the strong product of two bounded twin-width graphs, one of which has bounded degree, has bounded twin-width.

Theorem 9. Let G and H be two graphs. Then $\mathrm{tww}(G \boxtimes H) \le \max\{\mathrm{tww}(G)(\Delta(H) + 1) + 2\Delta(H),\ \mathrm{tww}(H) + \Delta(H)\}$.
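As a quick sanity check of how the bound of Theorem 9 behaves, the following sketch simply evaluates its right-hand side. The function name `strong_product_tww_bound` is hypothetical. Since the expression is monotone in each argument, plugging in upper bounds on tww(G), tww(H), and ∆(H) still yields a valid upper bound.

```python
def strong_product_tww_bound(tww_g: int, tww_h: int, max_deg_h: int) -> int:
    """Right-hand side of Theorem 9:
    max{ tww(G) * (Delta(H) + 1) + 2 * Delta(H),  tww(H) + Delta(H) }."""
    return max(tww_g * (max_deg_h + 1) + 2 * max_deg_h, tww_h + max_deg_h)

# Example: H is a path (maximum degree 2, twin-width at most 1) and G is
# any graph of twin-width at most 3.
bound_for_path_factor = strong_product_tww_bound(tww_g=3, tww_h=1, max_deg_h=2)
```

With a path as the bounded-degree factor, the bound is linear in tww(G), namely 3 tww(G) + 4.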
As cliques have twin-width 0, taking subgraphs does not in general preserve twin-width at all. Nevertheless on "sparse" classes, bounded twin-width is subgraph-closed. We show that if the strong product of a bounded twin-width class G with a bounded-degree bounded twin-width class H is $K_{t,t}$-free, then the subgraphs of $G \boxtimes H$ have bounded twin-width.
In particular flat classes have bounded twin-width (since graphs with bounded treewidth have bounded twin-width, and flat classes are $K_{t,t}$-free). By nature, the "flat class" approach to (1 + o(1)) log n-bits labeling schemes is limited to classes that are $K_t$-free. Another interesting limit case is minor-free classes which are not apex-minor free, like all the $K_6$-minor free graphs for example. Dujmović et al. [11] show that these classes are not flat.
We hope that the versatile tree of contractions (see Lemma 20) or the short parallel contraction sequence (see Lemma 23) may help for small dense classes and $K_t$-minor free graphs. We optimistically conjecture that our Theorem 8 can be improved to an optimal labeling scheme up to the second-order term.

Sparse twin-width
The trace of bounded twin-width on sparse classes is also an interesting and potentially new class. There are five natural ways of forcing a bounded twin-width class to be "sparse": forbidding $K_{t,t}$ as a subgraph, forbidding a d-grid minor in its adjacency matrix (and not a mere d-mixed minor), requiring that every graph has bounded average degree, requiring that the subgraphs also have bounded twin-width, and requiring that the class has bounded expansion. Let $A_\sigma(G)$ denote the adjacency matrix of G when V(G) is ordered by σ. We say that a class C is d-grid free if for every G ∈ C there is an ordering We show that all five definitions are actually equivalent. (v) There is a function f such that $\nabla_r(C) \le f(r)$ for every r.
Ignoring item (iv), a compact version of this theorem reads: For a hereditary class of bounded twin-width, having bounded grid minors, bicliques, average degree, or expansion are all equivalent.
Thus we say that a hereditary class has bounded sparse twin-width if it has bounded twin-width and satisfies any of the five items (that is, satisfies all five). One may wonder whether bounded sparse twin-width coincides with some existing sparse class. More generally it is interesting to see how bounded sparse twin-width compares to the established sparse classes. A few candidates come to mind: polynomial expansion, bounded expansion, bounded queue number, bounded stack number, bounded nonrepetitive coloring classes. Although we do not prove it for bounded queue or stack number, we argue that these classes do not coincide with bounded sparse twin-width.
As cubic graphs have unbounded twin-width, bounded expansion is strictly more general than bounded sparse twin-width. For the same reason, bounded nonrepetitive coloring does not imply bounded sparse twin-width. It is possible however that bounded sparse twin-width classes have bounded nonrepetitive coloring. The existence of an infinite family of cubic expanders with bounded twin-width implies that bounded sparse twin-width classes do not necessarily have polynomial expansion. If the small conjecture is true, polynomial expansion would be a strict subset of bounded sparse twin-width. We will show that classes with bounded queue number or bounded stack number have bounded (sparse) twin-width. We believe that this inclusion is strict and that the expanders based on random 2-lifts have unbounded queue and stack numbers.

Organization of the rest of the paper
In Section 3 we show Theorem 5, that every class of bounded twin-width is small. From this we conclude that non-small classes such as subcubic graphs, interval graphs, and triangle-free unit segment graphs have unbounded twin-width. This can be respectively put in perspective with the fact that some cubic expanders (as we see in Section 5), unit interval graphs, and $K_t$-free unit d-dimensional ball graphs, have bounded twin-width [6]. In Section 4 we leverage the results from the previous section to present O(log n)-bits adjacency labeling schemes on bounded twin-width classes. We then explore the converse of Theorem 5 for hereditary classes. In Section 5 we show that the small class of cubic expanders obtained by iterated 2-lifts from $K_4$ has indeed bounded twin-width. In Section 6 we prove that the s-subdivision of the clique $K_n$, with s > 0, has bounded twin-width if and only if s = Ω(log n). In Section 7 we prove Theorem 12, the list of characterizations of bounded sparse twin-width. We then show that flat classes, and classes with bounded queue or stack number have bounded (sparse) twin-width. In Section 8 we investigate the twin-width of the finite induced subgraphs of a fixed Cayley graph. We show that such classes are small for every finitely generated group. This is a rare example of a small class for which we still do not know if the twin-width is bounded.

Bounded twin-width classes are small
In this section we show that graphs of bounded twin-width have bounded versatile twin-width.
Informally it says that whenever we can find a sequence (or path) of d-contractions, we can even find a tree of D-contractions with linear arity, for some D bounded by a function of d. This result is fairly technical but shares some ideas and arguments with Section 5 of our previous paper [6]. We made the current section self-contained. We nevertheless mention some frequent parallels with [6]. Finally we can follow the end of the proof of Norine et al. [24] (that proper minor-closed classes are small) to extend the result to bounded twin-width.

The proof for proper minor-closed classes and how (not) to tune it
We need to redefine the notion of being d-good for bounded twin-width classes. A very natural candidate for that would be to say that a vertex is d-good if it admits a d-contraction with another vertex. After all, there is always such a vertex (or such a pair of vertices) in a d-trigraph. However, we cannot expect d-trigraphs to have linearly many such vertices. Think for instance of a path on n vertices. It has twin-width 1, but only four vertices (the two endpoints and their neighbors) that can be contracted to yield a 1-sequence. Surely we could allow mere D-contractions, for some D > d, but then we would leave the class of d-trigraphs. So it would be unclear which class we are bounding the size of. It is indeed noteworthy in the above sketch that by deleting a vertex or contracting adjacent vertices, one remains in the class of $K_t$-minor free graphs.
To overcome that issue, we introduce a more robust notion of bounded twin-width. A tree of d-contractions of a d-trigraph G is a rooted tree, whose root is labeled by G, and whose leaves are all labeled by 1-vertex graphs $K_1$, and such that one can go from any parent to any child by a d-contraction. With this new definition, d-sequences coincide with trees of d-contractions which are in fact paths. We say that a trigraph G has versatile twin-width d if there exists some p, function of d only, such that G admits a tree of d-contractions in which every internal node has at least |V(•)|/p children with distinct labels (where |V(•)| denotes the number of vertices of the corresponding node label). Such a tree is then called a versatile tree of d-contractions.
Let us say that a contraction is d-correct (or simply correct, once it is specified that it is a d-contraction) if the obtained graph has twin-width at most d. The inductive nature of versatile twin-width provides us with the desired stability. Not only does G admit linearly many correct d-contractions, but it also admits linearly many d-contractions towards graphs of versatile twin-width d. This is indeed witnessed by the subtrees rooted at each child of the root labeled by G. We now focus on proving that every trigraph with twin-width d has a versatile tree of D-contractions, for a larger D function of d only. This is a bit technical, but once it is done, we will be able to mimic the end of Norine et al.'s proof.
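The basic step behind every edge of such a tree is a single trigraph contraction. The sketch below is only an illustration under assumed conventions: the dictionary-based representation and the names `contract`, `max_red_degree`, and `path` are hypothetical, not taken from the paper. It implements the usual rule: the merged vertex keeps a black edge (or a non-edge) to x only if both contracted vertices agree on x without any red edge, and gets a red edge otherwise.

```python
from typing import Dict, FrozenSet, Hashable, Optional, Set, Tuple

Edges = Dict[FrozenSet[Hashable], str]  # unordered pair -> "black" or "red"

def color(edges: Edges, x: Hashable, y: Hashable) -> Optional[str]:
    """Color of the pair xy, or None for a non-edge."""
    return edges.get(frozenset((x, y)))

def contract(vertices: Set[Hashable], edges: Edges,
             u: Hashable, v: Hashable, merged: Hashable) -> Tuple[Set[Hashable], Edges]:
    """Identify u and v into the new vertex `merged`."""
    new_vertices = (vertices - {u, v}) | {merged}
    new_edges = {e: c for e, c in edges.items() if u not in e and v not in e}
    for x in vertices - {u, v}:
        cu, cv = color(edges, u, x), color(edges, v, x)
        if cu == cv and cu != "red":
            if cu == "black":          # homogeneous: black edge is kept
                new_edges[frozenset((merged, x))] = "black"
        else:                          # disagreement or existing red edge
            new_edges[frozenset((merged, x))] = "red"
    return new_vertices, new_edges

def max_red_degree(vertices: Set[Hashable], edges: Edges) -> int:
    return max((sum(1 for e, c in edges.items() if c == "red" and x in e)
                for x in vertices), default=0)

def path(n: int) -> Tuple[Set[Hashable], Edges]:
    """Path on vertices 0, ..., n-1 with black edges only."""
    return set(range(n)), {frozenset((i, i + 1)): "black" for i in range(n - 1)}

P6 = path(6)
after_end = contract(*P6, 0, 1, "01")     # endpoint with its neighbor
after_middle = contract(*P6, 2, 3, "23")  # two middle vertices
```

On the 6-vertex path, contracting an endpoint with its neighbor creates a single red edge, whereas contracting the two middle vertices gives the merged vertex red degree 2, so the latter is not a 1-contraction.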

Neatly divided symmetric 0, 1, r-matrices
Recall that r (for red) is the error symbol. It will now be convenient to tune some of the notions developed in our previous paper specifically for 0, 1, r-matrices with particular divisions. The notions introduced without a definition are all formalized in Section 2 of the present paper, as well as in [6, Section 5]. Reading [6, Section 5] first does no harm, but it is not necessary to understand the current section.
We will manipulate divisions of 0, 1, r-matrices such that every zone either contains only r entries or contains no r entry and is horizontal or vertical (or both). Let us call such a division neat. Zones filled with r entries are now called mixed. A neatly divided matrix is a pair $(M, (\mathcal{R}, \mathcal{C}))$ where M is a 0, 1, r-matrix and $(\mathcal{R}, \mathcal{C})$ is a neat division of M. A t-mixed minor in a neatly divided matrix is a (t, t)-division which coarsens the neat division, and contains in each of its $t^2$ zones at least one mixed zone (filled with r entries) or a 0,1-corner. See Figure 2 for an illustration. A neatly divided matrix is said to be t-mixed free if it does not admit a t-mixed minor. A and there is a 0,1-corner in the 2-by-|R| zone defined by the last column of $C_i$, the first column of $C_{i+1}$, and R. Importantly, a mixed cut cannot border a mixed zone. (This is a difference with the definition of [6, Section 5].) The mixed value of a row-part $R \in \mathcal{R}$ of a neat division $(\mathcal{R}, \mathcal{C} = \{C_1, C_2, \ldots\})$ is the number of mixed zones $R \cap C_j$ plus the number of mixed cuts between two (adjacent non-mixed) zones $R \cap C_j$ and $R \cap C_{j+1}$. Note that a mixed cut counts for one unit in the mixed value, regardless of the number of corners overlapping the two adjacent zones. We similarly define the mixed value of a column-part $C \in \mathcal{C}$.
The mixed value of a neat division of a 0, 1, r-matrix is the maximum of the mixed values taken over every part. The part size of a division (resp. partition) $(\mathcal{R}, \mathcal{C})$ is defined as $\max(\max_{R \in \mathcal{R}} |R|, \max_{C \in \mathcal{C}} |C|)$. A division is symmetric if the largest row index of each row-part and the largest column index of each column-part define the same set of integers, that is informally, if the horizontal separations mirror the vertical separations across the main diagonal. For instance the division depicted in Figure 2 is symmetric since both the largest row indices of the row-parts and the largest column indices of the column-parts define the set {2, 3, 4, 6}. We call symmetric fusion of a symmetric division the fusion of two consecutive parts in $\mathcal{C}$ and of the two corresponding parts in $\mathcal{R}$. A symmetric fusion on a symmetric division yields another symmetric division. A matrix $A := (a_{i,j})_{i,j}$ is said to be symmetric in the usual sense, namely if $a_{i,j} = a_{j,i}$ for every entry $a_{i,j}$ of A.
Figure 2: To the left, a neat division: each zone is horizontal, or vertical, or full with r entries (mixed zone). Note that the division is symmetric but not the matrix. To the right, in bold, a 3-mixed minor of the neat division. Observe that it coarsens the neat division and contains in each of its 9 zones either a 0,1-corner or a mixed zone (framed by red dashed boxes).
The following definition is crucial. It lists the invariants that we want to keep in our neatly divided matrices in order to build a versatile tree of contractions. In the previous definition, $c_d := \frac{8}{3}(d+1)^2 2^{4d}$ as defined in the improvement of the Marcus-Tardos bound [8]. The conditions of the first and second bullets are enough to bound the red number of a neatly divided matrix of $\mathcal{M}_{n,d}$.
Proof. Any row or column intersects at most $4c_d$ mixed zones (filled with r entries). Each mixed zone has width and length bounded by the part size $2^{4c_d+2}$. Hence the maximum total number of r entries on a single row or column is at most $4c_d \cdot 2^{4c_d+2}$.
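To get a feeling for the magnitude of this bound, the following sketch evaluates $c_d$ exactly and the base-2 logarithm of $4c_d \cdot 2^{4c_d+2}$. The function names are hypothetical, and the logarithm is used only because the bound itself is astronomically large.

```python
import math
from fractions import Fraction

def c(d: int) -> Fraction:
    """c_d = 8/3 * (d + 1)^2 * 2^(4d), kept as an exact rational."""
    return Fraction(8, 3) * (d + 1) ** 2 * 2 ** (4 * d)

def log2_red_number_bound(d: int) -> float:
    """log2 of the bound 4 * c_d * 2^(4 * c_d + 2) on the red number."""
    four_c = float(4 * c(d))
    return math.log2(four_c) + four_c + 2

# Already for d = 1 the bound has several hundred binary digits.
bits_for_d1 = log2_red_number_bound(1)
```

This makes the remark on the size of the multiplicative constants tangible: the bound is exponential in $c_d$, which is itself exponential in d.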

Finding invariant-preserving coarsenings
A coarsening of a neatly divided matrix $(M, (\mathcal{R}, \mathcal{C}))$ is a neatly divided matrix $(M', (\mathcal{R}', \mathcal{C}'))$ such that $(\mathcal{R}', \mathcal{C}')$ is a coarsening of $(\mathcal{R}, \mathcal{C})$, and M' is obtained from M by setting to r all entries that lie, in M divided by $(\mathcal{R}', \mathcal{C}')$, in a zone with at least one r entry or a 0,1-corner. We also refer to the process of going from $(M, (\mathcal{R}, \mathcal{C}))$ to $(M', (\mathcal{R}', \mathcal{C}'))$ as coarsening operation (or simply coarsening). A coarsening operation from (M, (R, , and elementary if it consists of a single symmetric fusion. The following lemma shows that not having a t-mixed minor is preserved for free in coarsenings of neatly divided matrices. There are two possibilities for a zone Z of $(\mathcal{R}^*, \mathcal{C}^*)$ in $(M', (\mathcal{R}', \mathcal{C}'))$. Either it contains a 0,1-corner, but then, Z contains the same 0,1-corner in $(M, (\mathcal{R}, \mathcal{C}))$. This is because the coarsening operation of a neatly divided matrix never replaces entries by 0 or 1 entries (we may only add r entries). Or Z contains an r entry, or more precisely a zone $Z' \subseteq Z$ of $(M', (\mathcal{R}', \mathcal{C}'))$ filled with r entries. Either one of these r entries was already present in $(M, (\mathcal{R}, \mathcal{C}))$, or the r entries of Z' appear after the fusion of a zone Therefore, in any case, Z in $(M, (\mathcal{R}, \mathcal{C}))$ contains a 0,1-corner or an r entry. We conclude that $(\mathcal{R}^*, \mathcal{C}^*)$ is a t-mixed minor of $(M, (\mathcal{R}, \mathcal{C}))$.
The previous lemma will in particular give us some control on the average mixed value among the parts of a coarsening of a neatly divided matrix in $\mathcal{M}_{n,d}$. This turns out to be crucial to find a coarsening which preserves the imposed upper bound on the overall mixed value. Let us assume by contradiction that the average mixed value γ, taken among every part $C' \in \mathcal{C}'$ on $(M', (\mathcal{R}, \mathcal{C}'))$, is strictly greater than $2c_d$. We consider two coarsenings of (M Let $C' \in \mathcal{C}'$ be any part with a mixed zones and b mixed cuts on $(M', (\mathcal{R}, \mathcal{C}'))$. Let $a_1$, respectively $a_2$, be the number of mixed zones of C' on $(M', (\mathcal{R}_1, \mathcal{C}'))$, respectively on $(M', (\mathcal{R}_2, \mathcal{C}'))$. We claim that $a + b \le a_1 + a_2$.
Indeed we can design the following injection from the mixed zones and cuts of C' on $(M', (\mathcal{R}, \mathcal{C}'))$ to the mixed zones of C' on $(M', (\mathcal{R}_1, \mathcal{C}'))$ and on $(M', (\mathcal{R}_2, \mathcal{C}'))$. We order the mixed zones and cuts of C' on $(M', (\mathcal{R}, \mathcal{C}'))$ from top to bottom, say, $x_1, x_2, \ldots, x_{a+b}$. For i going from 1 to a + b, we attribute $x_i$ to $(\mathcal{R}_1, \mathcal{C}')$ or to $(\mathcal{R}_2, \mathcal{C}')$ based on the following rules. If $x_i$ is a mixed cut, there is a unique j ∈ {1, 2} such that $x_i$ is contained in a mixed zone of C' on $(M', (\mathcal{R}_j, \mathcal{C}'))$, so we map $x_i$ to this mixed zone. If $x_i$ is a mixed zone, we map it to the mixed zone containing $x_i$ in $(M', (\mathcal{R}_{3-j}, \mathcal{C}'))$, where $x_{i-1}$ was mapped to a mixed zone in $(M', (\mathcal{R}_j, \mathcal{C}'))$. This is possible since there is a zone containing $x_i$ in both $(M', (\mathcal{R}_1, \mathcal{C}'))$ and $(M', (\mathcal{R}_2, \mathcal{C}'))$. For $x_1$ to be well-defined, we can imagine that there is a fictitious $x_0$ attributed to $(\mathcal{R}_2, \mathcal{C}')$. To see that this is indeed an injection we first need to recall that there is no mixed cut bordering a mixed zone. Suppose on the contrary that a same mixed zone Z of C' on, say, $(M', (\mathcal{R}_1, \mathcal{C}'))$ has two preimages $x_i$ and $x_{i'}$, with i < i'. If $x_i$ and $x_{i'}$ are mixed zones, they need to be consecutive to both be in Z, hence i' = i + 1. But then $x_{i'}$ should have been attributed to $(M', (\mathcal{R}_2, \mathcal{C}'))$ according to our rules. As Z contains at most one mixed cut of C' on $(M', (\mathcal{R}, \mathcal{C}'))$, $x_i$ and $x_{i'}$ cannot both be mixed cuts. Finally it is impossible that exactly one of $x_i$, $x_{i'}$ is a mixed zone (and the other a mixed cut), since it would imply a mixed cut incident to a mixed zone. See Figure 3 for an illustration.
Let $\alpha_1$, respectively $\alpha_2$, be the average, taken among every $C' \in \mathcal{C}'$, of the number of mixed zones on $(M', (\mathcal{R}_1, \mathcal{C}'))$, respectively $(M', (\mathcal{R}_2, \mathcal{C}'))$. Summing up the last inequality for every $C' \in \mathcal{C}'$, it holds that $\gamma \le \alpha_1 + \alpha_2$. Thus $\alpha_1 + \alpha_2 > 2c_d$. Without loss of generality, we assume that $\alpha_1 > c_d$. So by the Marcus-Tardos theorem (Theorem 1) applied to the 0,1-matrix with as many entries as zones of $(\mathcal{R}_1, \mathcal{C}')$, and a 1 in a mixed zone and a 0 otherwise, we obtain a d-mixed minor in a coarsening of $(M, (\mathcal{R}, \mathcal{C})) \in \mathcal{M}_{n,d}$. This contradicts Lemma 15, since neatly divided matrices of $\mathcal{M}_{n,d}$ are d-mixed free.
Finally we check again, with our slightly different definition of mixed value (compared to that of [6, Section 5]), that the column-part fusions can only decrease the mixed value of row-parts (and vice versa). , (R, C)), the border between C and C' is a mixed cut for R. Thus we can charge the contribution of $R \cap C^*$ to the mixed value in $(M', (\mathcal{R}, \mathcal{C}'))$ to a unit of mixed value in $(M, (\mathcal{R}, \mathcal{C}))$. Besides, the borders of $R \cap C^*$ cannot contribute mixed cuts for R, since the zone is mixed (recall the definition of a mixed cut for neatly divided 0, 1, r-matrices). Finally the remaining mixed zones and mixed cuts of R stayed unchanged between $(M, (\mathcal{R}, \mathcal{C}))$ and $(M', (\mathcal{R}, \mathcal{C}'))$.
We are now equipped to find invariant-preserving coarsenings. Proof. We maintain a set B of parts of size at least $2^{4c_d+1} + 1$, and refer to these parts as large. Note that a large part has more than $\ell$ elements, and every part of a neatly divided matrix of $\mathcal{M}_{n,d}$ has at most $2\ell$ elements. A part with at most $\ell$ elements is called a small part. The general plan is to coarsen $(\mathcal{R}, \mathcal{C})$ by successive invariant-preserving symmetric fusions (i.e., elementary coarsenings) of pairs of small parts, until $|B| \ge \lfloor n/s \rfloor$. At that point, we will be able to find a pair of identical columns in each large part. The crux of the current lemma is to show that we can always perform a symmetric fusion and remain in the class $\mathcal{M}_{n,d}$ (mainly, keep the mixed value below $4c_d$), even when a small fraction of the parts can no longer be merged (mainly, because they are large).
As an important rule for the fusion, we never merge a large part with another part. We set $h := |\mathcal{R}| = |\mathcal{C}|$, and greedily find $z := h - 2n/s$ disjoint pairs of small consecutive parts in $\mathcal{C}$, say, $(C_1, C'_1), \ldots, (C_z, C'_z)$. As $n \le 2\ell h$, it holds that $z \ge n/(2\ell) - 2n/s = 2n/s$. We call frozen any part of $\mathcal{C}$ which is not among $(C_1, C'_1), \ldots, (C_z, C'_z)$ (because it is large or next to a large part).
Let $(\mathcal{R}, \mathcal{C}^*)$ be the division resulting from the fusion of the pair of consecutive parts As $h \le 2|\mathcal{C}^*|$, the average mixed value among the parts of $\mathcal{C}^*$ is, by Lemma 16, at most $2c_d$. Since z > 2n/s there are more parts $C^*_i$ than frozen parts. Hence the average mixed value among the non-frozen parts of $\mathcal{C}^*$ on $(\mathcal{R}, \mathcal{C}^*)$ is at most $4c_d$. This means that there is a merged part $C^*_i$ whose mixed value on $(\mathcal{R}, \mathcal{C}^*)$, hence on $(\mathcal{R}, \mathcal{C} \cup \{C^*_i\} \setminus \{C_i, C'_i\})$, is at most $4c_d$. We perform this fusion. Every zone of $C^*_i$ which is mixed is filled with r entries. This may come from the fusion of a mixed zone with any other zone, or of two zones whose union has a 0,1-corner. Immediately afterwards we perform the fusion of the corresponding two parts in $\mathcal{R}$, and the similar update of the entries. If $C^*_i$ is large, we add it to B.
Let us show that this elementary fusion (i.e., single symmetric fusion) is invariant-preserving. We already established that the mixed value of $C^*_i$ is at most $4c_d$. By Lemma 17, the other mixed values have not increased, so they still do not exceed $4c_d$. The same applies after the symmetric fusion of two parts of $\mathcal{R}$. After that elementary coarsening, the matrix and the division are still symmetric. By Lemma 15, the new neatly divided matrix is still d-mixed free. Finally because we merged two small parts in $\mathcal{C}$ and two small parts in $\mathcal{R}$, still no part exceeds $2\ell = 2^{4c_d+2}$. Hence the new neatly divided matrix is indeed still in $\mathcal{M}_{n,d}$.
We proceed with these invariant-preserving elementary fusions until B contains at least $\lfloor n/s \rfloor$ parts. Let $(M', (\mathcal{R}', \mathcal{C}')) \in \mathcal{M}_{n,d}$ be the neatly divided matrix that we eventually reach. We claim that there is a pair of identical columns in each part C of B. Since the mixed value of C on $(\mathcal{R}', \mathcal{C}')$ is at most $4c_d$, we claim that the number of different columns is at most $2^{4c_d+1} = \ell$. (This part of the proof follows the second paragraph of the proof of [6, Theorem 9].) Indeed let us consider maximal blocks of consecutive (non-mixed) vertical zones $C \cap R_i$ not separated by a mixed cut. A block ends at a mixed cut or just before a mixed zone, so there are at most $4c_d + 1$ such blocks. Observe that a block, seen as a single zone, is vertical (otherwise there would be a 0,1-corner, hence a mixed cut). We also notice that outside of these blocks all the columns of C are equal, since they traverse mixed zones (filled with r entries) and horizontal zones. Finally there are only two columns within a block: all 0 entries or all 1 entries. Therefore there are at most $2^{4c_d+1}$ pairwise-distinct columns.
By definition of a large part, $|C| \ge 2^{4c_d+1} + 1$. Thus we find two equal columns in C. Now it will become apparent why we are filling the mixed zones with r entries. This allows us to simulate a contraction as a simple deletion of an equal row (and a symmetric equal column). The following lemma is straightforward and states that this operation is invariant-preserving in $\mathcal{M}_{\cdot,d}$.
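The pigeonhole step (a part with more columns than possible column patterns must contain two equal columns) is easy to carry out explicitly. The sketch below is an illustration only: the matrix encoding as rows of characters and the name `find_equal_columns` are assumptions made for this example.

```python
from typing import Dict, Optional, Sequence, Tuple

def find_equal_columns(matrix: Sequence[Sequence[str]],
                       part: Sequence[int]) -> Optional[Tuple[int, int]]:
    """Return two column indices of `part` whose columns are identical,
    or None if all columns of the part are pairwise distinct."""
    seen: Dict[Tuple[str, ...], int] = {}
    for j in part:
        column = tuple(row[j] for row in matrix)
        if column in seen:
            return seen[column], j
        seen[column] = j
    return None

# A small symmetric 0,1,r-matrix, one string per row.
M = ["01r0",
     "10r1",
     "rrrr",
     "01r0"]
pair = find_equal_columns(M, part=[0, 1, 2, 3])
```

Here columns 0 and 3 coincide, so the corresponding two vertices can be contracted by deleting one of the two columns together with its symmetric row.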

Bounded twin-width classes have bounded versatile twin-width
We can now use Lemma 18 to find linearly many pairs of vertices that can be contracted, and Lemma 19 to recurse. This will be our scheme to find a versatile tree of contractions.

Lemma 20. Every trigraph of twin-width d has versatile twin-width at most 4c
Proof. Let G be an n-vertex graph of twin-width d, and let $A := A_\sigma(G)$ be its adjacency matrix in an order σ compatible with a d-sequence of G. By definition A is d-twin-ordered, so by Theorem 2 it is (2d + 2)-mixed free. We set $d' := 2d + 2$, $\ell := 2^{4c_{d'}+1}$, $s := 8\ell$, and Then $(A, (\mathcal{R}, \mathcal{C}))$ is a neatly divided matrix of $\mathcal{M}_{n,d'}$. Indeed the mixed value is 0.
We apply Lemma 18 to $(A, (\mathcal{R}, \mathcal{C}))$ and find a coarsening $(A', (\mathcal{R}', \mathcal{C}')) \in \mathcal{M}_{n,d'}$ with $\lfloor n/s \rfloor$ disjoint pairs of identical columns $(\gamma_1, \gamma'_1), \ldots, (\gamma_{\lfloor n/s \rfloor}, \gamma'_{\lfloor n/s \rfloor})$. These pairs of columns correspond to the pairs of vertices $(a_1, b_1), \ldots, (a_{\lfloor n/s \rfloor}, b_{\lfloor n/s \rfloor})$. We now argue that, for every $i \in [\lfloor n/s \rfloor]$, the contraction of $a_i$ and $b_i$, resulting in $ab_i$, is D-correct. First let us justify that it is a D-contraction. The red degree of $ab_i$ is bounded by the number of red entries of $\gamma_i$ (since we filled the mixed zones with r entries). So by Lemma 14, it is bounded by $4c_{d'} 2^{4c_{d'}+2} = D$. The red degree of the other vertices can increase by one, but again by Lemma 14, it does not exceed D. The contraction is D-correct. Indeed applying repeatedly Lemma 18 followed by Lemma 19 gives a sequence of D-contractions. This stops when "$\lfloor n/s \rfloor = 0$", that is n < s < D. At that point, finishing the contraction sequence in any way builds a complete D-sequence. Thus every element of $\mathcal{M}_{\cdot,d'}$ has twin-width at most D.
Therefore we have found $\lfloor n/s \rfloor$ D-correct contractions on disjoint pairs of vertices. They constitute the children of the root labeled by G in a versatile tree of D-contractions. For each $i \in [\lfloor n/s \rfloor]$, by Lemma 19, we build the subtree whose root is labeled by $G/a_i, b_i$ with the neatly divided matrix of $\mathcal{M}_{n-1,d'}$ obtained by removing from $(A', (\mathcal{R}', \mathcal{C}'))$ the column $\gamma_i$ and its symmetric row. Thus G has versatile twin-width D.

Finishing the proof
Lemma 20 is all we need to mimic Norine et al.'s proof for $K_t$-minor free graphs [24], as described in Section 3.1. Any graph of $L_{n,D}$ admits an index $i \in [n-1]$ such that the contraction of vertex n and vertex i is D-correct. Therefore any $H \in L_{n,D}$ can be obtained from an $H' \in I_{n-1,D}$ and $i \in [n-1]$ by splitting i into i and a new vertex n, and by linking them to the rest of H observing the following rules. Every black edge between i and j in H' forces two black edges ij and nj in H. Every red edge between i and j in H' forces one of five alternatives in H: a red edge between i and j and anything between n and j (3 alternatives: non-edge, black edge, red edge), or a red edge between n and j and a black edge or a non-edge between i and j (2 alternatives). Additionally, there might be a non-edge, black edge, or red edge between i and n. In total, the number of possible graphs H is bounded by 3 .

Showing that a class has unbounded twin-width by counting
We have shown that bounded twin-width classes are small. This may be used to establish that the twin-width of some graphs is unbounded, namely if these graphs do not form a small class. It is not so easy to show that cubic graphs have unbounded twin-width by direct arguments. Theorem 21 implies this fact by a simple counting argument. A bipartite cubic graph is the disjoint union of three perfect matchings. Each matching can be defined in (n/2)! different ways, leading to at least $(n/2)!^3 / 3^{3n/2} = n^{3n/2+o(n)}$ graphs on vertex set [n], well above $n!\,c^n = n^{n+o(n)}$. Similarly, two arbitrary total orders on [n] can be defined in $(n!)^2$ ways, hence cannot have bounded twin-width.
We will now define a simple class of graphs capturing two arbitrary orders. Then we will show that these graphs are representable by intervals and by unit disks, and conclude that interval graphs and unit disk graphs have unbounded twin-width. Of course we did not expect these classes to have bounded twin-width, since FO model checking is W[1]-hard on interval graphs [22], while the mere Maximum Independent Set is W[1]-hard on unit disk graphs [21]. We give a more satisfactory proof of that fact, not using the complexity-theoretic assumption FPT ≠ W[1].
We define the (non-hereditary) class B by its slices $B_n$ of graphs on vertex set [3n]. Each graph of $B_n$ has its vertex set partitioned into three cliques of size n, say, (A, B, C). There is no edge between A and C. There are two arbitrary half-graphs between A and B, and between B and C. To build a half-graph between A and B, we first choose an order for the vertices of A, say, $a_1, a_2, \ldots, a_n$, and an order for Let us estimate the number of graphs in $B_n$, ignoring the single-exponential factors such as the one required to fix the partition (A, B, C). The half-graph between A and B is defined by choosing a total order for A and a total order for B. There are $n!^2$ such pairs of orders. Defining the half-graph between B and C requires an additional total order for B (recall that this second ordering of B is independent of its order for the half-graph on A ∪ B) and a total order for C. Again this amounts to $n!^2$. Overall there are more than $n!^4$ graphs in $B_n$. Thus $|B_n|$ grows like $n^{4n+o(n)}$, while the number of bounded twin-width graphs with vertices labeled by [3n] is only at most $(3n)!\,c^{3n} = n^{3n+o(n)}$.
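The comparison between $(n/2)!^3 / 3^{3n/2}$ and $n!\,c^n$ can be checked numerically on a logarithmic scale. In the sketch below the function names are hypothetical and the value c = 10 is an arbitrary choice made only for illustration; any fixed c is eventually overtaken.

```python
import math

def log_cubic_lower_bound(n: int) -> float:
    """Natural log of (n/2)!^3 / 3^(3n/2), for even n."""
    return 3 * math.lgamma(n / 2 + 1) - 1.5 * n * math.log(3)

def log_small_class_bound(n: int, c: float) -> float:
    """Natural log of n! * c^n."""
    return math.lgamma(n + 1) + n * math.log(c)

def cubic_count_exceeds(n: int, c: float) -> bool:
    """True if the lower bound on bipartite cubic graphs beats n! * c^n."""
    return log_cubic_lower_bound(n) > log_small_class_bound(n, c)
```

For c = 10 the lower bound is still below $n!\,c^n$ at n = 100 but exceeds it at n = 10^6, in line with the asymptotics $n^{3n/2+o(n)}$ versus $n^{n+o(n)}$.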
One can describe an unlabeled graph of $B_n$ with a single permutation σ over [n] such that $b_{\sigma(i)} = b'_i$. Figure 4 shows how to realize a graph of $B_n$ as the intersection graph of intervals or as the intersection graph of unit disks, for any given permutation σ.
Figure 4: To the left, a representation of a graph of $B_5$ by intervals. All intervals are obviously stacked up on a single real line, by projection on the x-axis. To the right, the same graph represented with unit disks. The permutation σ associated to the graph is 41532. In both representations, one can read out the permutation matrix of 41532, where the first row is the bottom one, not the top one. For the intervals, this permutation matrix appears in the small gaps between the intervals of B and C, while for the unit disks the matrix appears in the centers of the disks of B.
Unit d-dimensional ball intersection graphs with bounded clique number have bounded twin-width [6]. One could wonder if $K_t$-free string graphs have bounded twin-width. Figure 5 shows that even triangle-free unit segment graphs have unbounded twin-width. Indeed it shows how to represent any graph of $B'_n$ with axis-parallel triangle-free unit segments, where $B'_n$ is defined analogously to $B_n$ but the sets A, B, C now induce independent sets, and not cliques. The same argument establishes that the growth of $B'_n$ is not that of a small class.
Let us say that a class C is t-bounded if there is a function $f_C$ such that every $K_t$-free graph G of C has twin-width at most $f_C(t)$. The previous remark shows that there are classes that are χ-bounded but not t-bounded, since unit segment graphs are χ-bounded [28]. In a subsequent paper [5], we show that classes of bounded twin-width are χ-bounded. This implies in particular that every t-bounded class is χ-bounded, hence the set of t-bounded classes is a proper subset of the set of χ-bounded classes.
One can check that the resulting trigraph does not depend on the order in which the pairs are contracted. Thus instead of the contraction of a sequence of pairs, we may as well speak of the parallel contraction of a set of disjoint pairs. A sequence of parallel d-contractions, or parallel and $G_{i-1}$ is obtained from $G_i$ by a parallel contraction (of disjoint pairs of vertices). It is noteworthy that the existence of a parallel contraction sequence is equivalent to the existence of a (regular) contraction sequence, up to a multiplicative factor in the red degree.

Proposition 22. Let G be a trigraph, and d ∈ N.
If G admits a d-sequence, then G also admits a parallel d-sequence.
If G admits a parallel d-sequence, then G also admits a (2d + 1)-sequence.
Proof. The first item is clear since parallel contractions generalize mere contractions. We now show the second item. Let G and G' be d-trigraphs, with G' obtained from G by the parallel contraction of $\{a_1, b_1\}, \ldots, \{a_\ell, b_\ell\}$. This parallel contraction can be sequentialized as $G_0, \ldots, G_\ell$ where $G_0 = G$, and $G_i$ is obtained from $G_{i-1}$ by contracting $\{a_i, b_i\}$ into $ab_i$, so that $G_\ell = G'$. We claim that $G_i$ is a (2d + 1)-trigraph for any $i \in [0, \ell]$.
Consider $x \in V(G_i)$, and let $B^R_{G_i}(x) \subseteq V(G_i)$ be composed of x and all its red neighbors in $G_i$. There is a natural embedding $e : V(G_i) \to V(G')$ through contraction, namely $e(a_j) = e(b_j) = ab_j$ for $i < j \le \ell$, and e(x) = x for any other vertex. By definition of trigraph contractions, if xy is a red edge in $G_i$, then either e(x) = e(y), or e(x)e(y) is a red edge in G'. Hence $e(B^R_{G_i}(x)) \subseteq B^R_{G'}(e(x))$. Furthermore, because e corresponds to the contraction of disjoint pairs, any Hence the red degree of Thus if G and G' are d-trigraphs and G' is obtained from G by a parallel contraction, then any sequentialization of the parallel contraction produces a sequence of (2d + 1)-trigraphs. Applying this result to every step of a parallel d-sequence yields a (2d + 1)-sequence.
Our main result on parallel contraction sequences is that one can always find a parallel sequence of logarithmic length, at the cost of an increase in the red degree. This is a variant of the versatile twin-width theorem presented in Section 3 (Lemma 20). The difference with Lemma 20 is that we want to prove that the parallel contraction of these pairs of vertices is D-correct. Nonetheless, the arguments remain the same. For any i, since $\gamma_i = \gamma'_i$, the contraction of $(a_i, b_i)$ can be done by simply deleting $\gamma_i$. This yields a neat division of the contracted graph. Lemma 19 readily generalizes to parallel contractions, hence this new division is still in $\mathcal{M}_{\cdot,d'}$. By Lemma 14, the red number of this new division is at most D. This in turn bounds the red degree of the contracted graph (since mixed zones are filled with r entries). Hence the parallel D-contraction preserves the membership to $\mathcal{M}_{\cdot,d'}$. Applying repeatedly Lemmas 18 and 19 gives a sequence of parallel D-contractions until reaching a graph of size n with n < s < D, at which point the D-sequence can be completed in any way.
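Both facts used around Proposition 22 (the result of a parallel contraction does not depend on the order, and every sequentialization stays within red degree 2d + 1) can be observed on a toy instance. The sketch below is an illustration under assumed conventions: vertices are frozensets of original vertices, and all function names are hypothetical.

```python
from itertools import permutations
from typing import Dict, FrozenSet, List, Sequence, Set, Tuple

Vertex = FrozenSet[int]
Trigraph = Tuple[Set[Vertex], Dict[FrozenSet[Vertex], str]]

def contract(g: Trigraph, u: Vertex, v: Vertex) -> Trigraph:
    """Contract u and v; the merged vertex is the union of both sets."""
    vertices, edges = g
    w = u | v
    new_edges = {e: c for e, c in edges.items() if u not in e and v not in e}
    for x in vertices - {u, v}:
        cu, cv = edges.get(frozenset((u, x))), edges.get(frozenset((v, x)))
        if cu != cv or cu == "red":
            new_edges[frozenset((w, x))] = "red"
        elif cu == "black":
            new_edges[frozenset((w, x))] = "black"
    return (vertices - {u, v}) | {w}, new_edges

def max_red_degree(g: Trigraph) -> int:
    vertices, edges = g
    return max(sum(c == "red" and x in e for e, c in edges.items()) for x in vertices)

def sequentialize(g: Trigraph, pairs: Sequence[Tuple[int, int]]) -> List[Trigraph]:
    """Contract disjoint pairs of original vertices one by one and
    return the initial trigraph followed by all intermediate ones."""
    steps = [g]
    for a, b in pairs:
        steps.append(contract(steps[-1], frozenset({a}), frozenset({b})))
    return steps

def path_trigraph(n: int) -> Trigraph:
    vertices = {frozenset({i}) for i in range(n)}
    edges = {frozenset((frozenset({i}), frozenset({i + 1}))): "black"
             for i in range(n - 1)}
    return vertices, edges

G = path_trigraph(6)
pairs = [(0, 1), (2, 3), (4, 5)]
runs = [sequentialize(G, order) for order in permutations(pairs)]
finals = [run[-1] for run in runs]
```

On this example G and the contracted trigraph G' have maximum red degree 0 and 2 respectively, so both are 2-trigraphs, and every intermediate trigraph of every ordering stays within red degree 2·2 + 1 = 5.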
This gives a parallel $D$-sequence of length $O(s \cdot \log n)$.

We now use these short parallel contraction sequences to design adjacency labeling schemes for bounded twin-width graphs.
4. for any distinct $x, y, z \in V(G)$, if $A(\ell(x), \ell(y)) = r_i$ and $A(\ell(x), \ell(z)) = r_j$, then $i \ne j$.

Note that we do not require $A$ to be symmetric: one may have $A(w_1, w_2) = r_j$ and $A(w_2, w_1) = r_{j'}$ with $j \ne j'$. In particular, condition 4 need not properly $d$-color the red edges.
Proof. We proceed by induction on the length of the parallel $d$-sequence. The base case $G = K_1$ is trivial, with the unique label being empty.
Let $G$ be a trigraph, and let $G'$ be obtained from $G$ by parallel contraction of the pairs $\{a_1, b_1\}, \ldots, \{a_h, b_h\}$. By induction, let us consider a labeling $\ell' : V(G') \to \{0,1\}^*$ for $G'$ satisfying conditions 2 to 4. Before defining a labeling $\ell$ on $G$, let us introduce some notations. For $i \in [h]$, let $ab_i \in V(G')$ be the vertex obtained from the contraction of $\{a_i, b_i\}$. We define two partial functions $p_0, p_1 : V(G') \to V(G)$, corresponding to the predecessors with respect to contraction: $p_0(ab_i) = a_i$ and $p_1(ab_i) = b_i$ for $i \in [h]$, while $p_0(x) = x$ and $p_1(x)$ is undefined for every vertex $x$ that does not result from a contraction. Note that any $y \in V(G)$ can be uniquely written as $p_c(x)$ for some $x \in V(G')$ and $c \in \{0,1\}$.

Next, for $x \in V(G')$ and $j \in [d]$, let us define the $j$-th red neighbor of $x$, denoted by $n_{r_j}(x)$. By condition 4, there can be at most one $y \in V(G') \setminus \{x\}$ such that $A(\ell'(x), \ell'(y)) = r_j$. We define $n_{r_j}(x)$ to be this unique $y$ if it exists, and to be undefined otherwise.
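Finally for a trigraph $H$ and any two distinct vertices $x, y \in V(H)$, the color of $xy$ is $\mathrm{col}_H(x, y) \in \{0, 1, r\}$, equal to $1$ if $xy$ is a black edge, to $r$ if $xy$ is a red edge, and to $0$ if $xy$ is a non-edge.

We can now define the labeling $\ell : V(G) \to \{0,1\}^*$. Given $y \in V(G)$, let $c \in \{0,1\}$, $x \in V(G')$ be such that $y = p_c(x)$. Then, $\ell(y)$ consists of the following fields:
1. the label $\ell'(x)$,
2. the bit $c$,
3. the color $\mathrm{col}_G(p_0(x), p_1(x))$,
4. for every $j \in [d]$ and $c' \in \{0,1\}$, the color $\mathrm{col}_G(y, p_{c'}(n_{r_j}(x)))$.

The fields 3 and 4 call partial functions (namely $p_0$, $p_1$, and $n_{r_j}$). If any of these functions is undefined on the relevant values, we use the convention to set the color to $0$ ($1$ would also be acceptable, but $r$ must be avoided).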
Let us now explain how $A$ is defined to inductively decode these labels. Note first that fields 2 to 4 have fixed size. Thus distinguishing the different fields is not an issue. Let $y_1, y_2 \in V(G)$ be two distinct vertices, with $y_1 = p_{c_1}(x_1)$ and $y_2 = p_{c_2}(x_2)$. As a first step, we want to retrieve $\mathrm{col}_G(y_1, y_2)$ from $\ell(y_1), \ell(y_2)$. There are several cases.

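If $x_1 = x_2$, then $y_1$ and $y_2$ are the two predecessors of the same contracted vertex, and $\mathrm{col}_G(y_1, y_2)$ is given in field 3 of $\ell(y_1)$.

Otherwise, if $\mathrm{col}_{G'}(x_1, x_2) \ne r$, then $\mathrm{col}_G(y_1, y_2) = \mathrm{col}_{G'}(x_1, x_2)$ by definition of a trigraph contraction. Furthermore we can compute $\mathrm{col}_{G'}(x_1, x_2)$ from $\ell'(x_1), \ell'(x_2)$ since $\ell'$ correctly encodes the colors in $G'$ (condition 3).

Otherwise, we have $\mathrm{col}_{G'}(x_1, x_2) = r$. Then let $j \in [d]$ be such that $A(\ell'(x_1), \ell'(x_2)) = r_j$. By definition of $n_{r_j}$, we have $x_2 = n_{r_j}(x_1)$, hence $\mathrm{col}_G(y_1, y_2)$ is given in field 4 of $\ell(y_1)$. The position of this information in field 4 is given by $j$ (obtained from $\ell'(x_1), \ell'(x_2)$ via $A$) and $c_2$ (field 2 in $\ell(y_2)$).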
As a second step, when $\mathrm{col}_G(y_1, y_2) = r$, we need to define the numbered red label $r_j$ such that $A(\ell(y_1), \ell(y_2)) = r_j$, with $j$ unique among the red edges incident to $y_1$. Here we use the fact that all the red edges incident to $y_1$ appear in fields 3 and 4 of $\ell(y_1)$. Thus, given $\ell(y_1)$, we can enumerate the red edges incident to $y_1$, and we fix the numbers on red labels according to this enumeration order. Since $G$ has red degree at most $d$ by hypothesis, labels $r_1, \ldots, r_d$ are sufficient (here, it is important that the color of "undefined" fields avoids $r$). Therefore conditions 3 and 4 are maintained.
The equality $\ell(y_1) = \ell(y_2)$ implies that $c_1 = c_2$, since their field 2 should match, and that $x_1 = x_2$, as $\ell'$ is injective. Thus it implies that $y_1 = p_{c_1}(x_1) = p_{c_2}(x_2) = y_2$, hence $\ell$ is injective.

Finally, let us analyze the size of the labels. Field 2 uses 1 bit. Fields 3 and 4 contain $2d+1$ colors, with 3 possible values. This can be encoded on $\lceil (2d+1) \log 3 \rceil$ bits. Thus, the label sizes for $\ell$ increase by exactly $1 + \lceil (2d+1) \log 3 \rceil$ compared to $\ell'$, and condition 1 is preserved.
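The size analysis above can be made concrete with a small sketch. It packs a list of colors into one integer by reading it as a base-3 number, which is one possible way (not necessarily the authors') to reach the stated bit count. The names `pack_colors`, `unpack_colors` and `field_bits`, as well as the choice of mapping the color r to the digit 2, are assumptions made for illustration only.

```python
# Hypothetical helper names; colors are the strings "0", "1" and "r".
DIGIT = {"0": 0, "1": 1, "r": 2}   # arbitrary choice: r is stored as digit 2
NAME = "01r"

def pack_colors(colors):
    """Read the list of colors as a base-3 number (most significant first)."""
    value = 0
    for color in colors:
        value = 3 * value + DIGIT[color]
    return value

def unpack_colors(value, count):
    """Inverse of pack_colors for a list of known length `count`."""
    out = []
    for _ in range(count):
        value, digit = divmod(value, 3)
        out.append(NAME[digit])
    return out[::-1]

def field_bits(d):
    """Bits needed to store any packed value of 2d + 1 colors."""
    return (3 ** (2 * d + 1) - 1).bit_length()

def label_growth(d):
    """Bits added to a label by one parallel contraction step (field 2 + fields 3, 4)."""
    return 1 + field_bits(d)
```

For instance, with d = 1 the three colors of fields 3 and 4 fit in 5 bits, so each step of the parallel sequence adds 6 bits to the label under this encoding.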
From Lemmas 23 and 24, we immediately conclude the following.
Theorem 25. The class of graphs with twin-width at most $d$ admits a $g(d) \log n$-bit adjacency labeling scheme, where $n$ is the number of vertices and $g$ is a double-exponential function.
The labeling scheme can in particular be used to encode an $n$-vertex graph of twin-width at most $d$ on $2^{2^{\gamma(d+1)}} n \log n$ bits, for some constant $\gamma$. This offers a significant compression over adjacency lists, since cliques for instance have twin-width 0. Now if the aim is only to globally compress the whole graph, and not to balance the lengths of the vertex labels, there is a simpler encoding with a better dependency in $d$. It basically consists of "reading" the $d$-sequence $G = G_n, \ldots, G_1 = K_1$ backwards. The encoding of $K_1$ is an identifier on $\lceil \log n \rceil$ bits. Then to go from $G_i$ to $G_{i+1}$, we write $3\lceil \log n \rceil + 2$ bits corresponding to the "split vertex" $w$, the two vertices $u, v$ into which $w$ is split, and whether there is a non-edge, a black edge, or a red edge between $u$ and $v$, followed by $d(\lceil \log n \rceil + 4)$ bits corresponding to the edges between $u, v$ and the at most $d$ vertices adjacent to $w$ in the red graph of $G_i$. The latter part is carried out by writing down the identifier of each red neighbor $z$ of $w$ followed by two pairs of bits encoding if there is a non-edge, a black edge, or a red edge between $u$ and $z$, and between $v$ and $z$. This allows one to reconstruct $G$, and to store it on only $(d+3) n \lceil \log n \rceil + (4d+2) n$ bits.
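As a sanity check of the arithmetic in the simpler encoding, the following sketch adds up the bits step by step and compares the total with the closed-form bound. The function names are hypothetical, and identifiers are assumed to take $\lceil \log_2 n \rceil$ bits.

```python
def id_bits(n):
    """ceil(log2(n)) for n >= 1, computed without floating point."""
    return (n - 1).bit_length()

def split_encoding_bits(n, d):
    """Bits written when reading a d-sequence of an n-vertex graph backwards."""
    log_n = id_bits(n)
    start = log_n                                   # identifier of the vertex of K_1
    per_split = 3 * log_n + 2 + d * (log_n + 4)     # w, u, v, color of uv, red neighbors
    return start + (n - 1) * per_split              # n - 1 splits rebuild G

def closed_form_bound(n, d):
    """The bound (d + 3) n log n + (4d + 2) n stated in the text."""
    log_n = id_bits(n)
    return (d + 3) * n * log_n + (4 * d + 2) * n
```

The exact count is slightly below the closed form because the text charges a full split record to each of the $n$ vertices, whereas only $n - 1$ splits occur.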

Expanders with bounded twin-width
A 2-lift of a graph $G$ is a graph $G'$ on twice as many vertices, built by duplicating every vertex $v \in V(G)$ into two copies, say, $v_1$ and $v_2$, and for every edge $vw \in E(G)$, adding to $E(G')$ either the edges $v_1 w_1$ and $v_2 w_2$ (parallel) or the edges $v_1 w_2$ and $v_2 w_1$ (crossing). The choice, for each edge of $G$, of having two parallel edges or two crossing edges is called the signing of the edges. See Figure 6 for an example of a 2-lift. Observe that $G$ has $2^{|E(G)|}$ possible 2-lifts or signings. For instance, the all-parallel signing gives two disjoint copies of $G$, while the all-crossing signing gives the bipartite adjacency graph of $G$.
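The definition of a 2-lift translates directly into code. The sketch below is illustrative only: the representation (edges as pairs, a signing given as the set of edges chosen to be crossing) and the names `two_lift` and `degrees` are assumptions, not notation from the paper.

```python
import itertools

def two_lift(edges, crossing):
    """Edge set of the 2-lift; vertex v becomes the two copies (v, 1) and (v, 2)."""
    lifted = set()
    for v, w in edges:
        if (v, w) in crossing:
            lifted.add(frozenset({(v, 1), (w, 2)}))
            lifted.add(frozenset({(v, 2), (w, 1)}))
        else:  # parallel signing
            lifted.add(frozenset({(v, 1), (w, 1)}))
            lifted.add(frozenset({(v, 2), (w, 2)}))
    return lifted

def degrees(edge_set):
    """Degree of every vertex that appears in some edge."""
    deg = {}
    for edge in edge_set:
        for x in edge:
            deg[x] = deg.get(x, 0) + 1
    return deg

# One particular signing of K_4: two crossing edges, four parallel ones.
k4_edges = list(itertools.combinations(range(4), 2))
lift = two_lift(k4_edges, crossing={(0, 1), (2, 3)})
```

On this example the lift has 8 vertices and 12 edges, and every vertex keeps degree 3, which illustrates that the 2-lift operation preserves degrees.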
Figure 6 An example of a 2-lift of K4.
For $n$ a power of 2, performing a sequence of $\log n - 2$ randomly-signed 2-lifts starting on $K_4$ yields an $n$-vertex expander almost surely [3]. Observe that the obtained graph is necessarily cubic since the 2-lift operation preserves the degree. Bilu and Linial [3] even exhibit a deterministic polytime procedure to actually find the signings leading from $K_4$ to a cubic expander. The next result shows that cubic expanders can have bounded twin-width.

Lemma 26. Every graph obtained from $K_4$ by performing a sequence of 2-lifts has twin-width at most 6.
Proof. We show that if $G$ is a cubic graph and $G'$ is a 2-lift of $G$, then $G$ can be obtained from $G'$ by a sequence of contractions in which the maximum degree never goes above 6. It is enough to conclude since $K_4$ is obviously 6-collapsible, and we can assume that the cubic trigraph we start from has all its edges red.
Let $v_1, v_2, \ldots, v_n$ be the vertices of $G$, and $v_i^1, v_i^2$ be the duplicates of $v_i$ in $G'$. For each $i$ running from 1 to $n$, we contract $v_i^1$ and $v_i^2$. By definition of a 2-lift, after these $n$ contractions, the graph obtained is $G$. We contracted disjoint pairs of vertices of degree 3, so we could not create vertices of degree more than 6.

This surprising result teaches us the following lessons. First, bounded twin-width appears more general than expected. Also, by Theorem 4, not only are there some expanders with bounded twin-width, but there are also some FO transductions of expanders with that property. Second, it tells us that even among bounded-degree graphs, bounded twin-width is a novel class. Indeed bounded twin-width could have coincided with polynomial expansion within the class of bounded-degree graphs. Now we know that it is not the case. There are cubic graphs with bounded twin-width but no strongly sublinear (i.e., of size at most $n^{1-\varepsilon}$ for some $\varepsilon > 0$) balanced separators. Expanders have treewidth $\Theta(n)$ and therefore no strongly sublinear balanced separators, the latter being equivalent to polynomial expansion [26,14].
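The contraction argument in the proof of Lemma 26 can be replayed on a small instance. The sketch below builds an all-red 2-lift of $K_4$, contracts the two copies of each vertex in turn using the usual trigraph contraction rule, and records the largest red degree seen. The data layout (two dictionaries of neighbor sets) and the function names are hypothetical choices made for this illustration.

```python
import itertools

def red_two_lift(n, edges, crossing):
    """All-red 2-lift of the graph on range(n): dict vertex -> set of red neighbors."""
    red = {(v, c): set() for v in range(n) for c in (1, 2)}
    for v, w in edges:
        for c in (1, 2):
            other = 3 - c if (v, w) in crossing else c
            red[(v, c)].add((w, other))
            red[(w, other)].add((v, c))
    return red

def contract(black, red, u, v, w):
    """Contract u and v into a new vertex w, updating both adjacency dicts in place."""
    others = set(black) - {u, v}
    new_black, new_red = set(), set()
    for x in others:
        if x in black[u] and x in black[v]:
            new_black.add(x)                 # black edge to both: stays black
        elif x in black[u] or x in black[v] or x in red[u] or x in red[v]:
            new_red.add(x)                   # any disagreement or red edge: red
    for x in others:
        black[x] -= {u, v}
        red[x] -= {u, v}
    for table in (black, red):
        del table[u], table[v]
    black[w], red[w] = new_black, new_red
    for x in new_black:
        black[x].add(w)
    for x in new_red:
        red[x].add(w)

edges = list(itertools.combinations(range(4), 2))
red = red_two_lift(4, edges, crossing={(0, 1), (1, 3)})
black = {x: set() for x in red}
worst = max(len(nb) for nb in red.values())
for v in range(4):
    contract(black, red, (v, 1), (v, 2), v)
    worst = max(worst, max(len(nb) for nb in red.values()))
```

After the four contractions the trigraph is $K_4$ with all edges red, and the recorded maximum red degree stays within the bound 6 claimed by the lemma.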
The third lesson is that designing good approximation algorithms in bounded twin-width classes promises to be challenging.It is perfectly fitting and propitious to ask for other algorithmic applications of twin-width.Before we understand enough to approximate in general bounded twin-width classes, an interesting first step is to approximate optimization problems such as Maximum Independent Set (MIS for short) on graphs with bounded degree and twin-width.MIS is APX-hard in general cubic graphs, so we may ask for a polynomial-time approximation scheme (PTAS) when we add the condition of bounded twin-width.A natural approach for that would be to show that these graphs have strongly sublinear balanced separators (this is how PTASes are obtained for planar, H-minor free graphs, etc.).This approach is now ruled out.Therefore, if MIS indeed admits a PTAS in bounded twin-width cubic graphs, this cannot be directly based on small balanced separators.The simplest toy-problem in that direction is to explore PTASes for iterated 2-lifts of K 4 .

Subdivisions of cliques
For any non-negative integer $k$, the $k$-subdivision of a graph $G$, denoted by $G^{(k)}$, is the graph obtained by subdividing every edge of $G$ exactly $k$ times. For any $f : \mathbb{N} \to \mathbb{N}$, let $G_f$ be the class formed by the $f(|V(G)|)$-subdivision of every graph $G$.
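For concreteness, the $k$-subdivision defined above can be built as follows. This is a minimal sketch; the function name `subdivide` and the naming of the new vertices as tuples `("sub", u, v, i)` are assumptions made for illustration.

```python
import itertools

def subdivide(edges, k):
    """Edge list of the k-subdivision: each edge uv becomes a path with k inner vertices."""
    new_edges = []
    for u, v in edges:
        inner = [("sub", u, v, i) for i in range(1, k + 1)]
        path = [u] + inner + [v]
        new_edges.extend(zip(path, path[1:]))
    return new_edges

# The 2-subdivision of K_4: every one of the 6 edges becomes a path on 3 edges.
k4_edges = list(itertools.combinations(range(4), 2))
k4_sub2 = subdivide(k4_edges, 2)
k4_sub2_vertices = {x for edge in k4_sub2 for x in edge}
```

In general the $k$-subdivision of a graph with $n$ vertices and $m$ edges has $n + km$ vertices and $(k+1)m$ edges.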
Theorem 27. For every positive and non-decreasing $f$, $G_f$ has bounded twin-width if and only if $f(n) = \Omega(\log n)$.
Let us first observe that for any integer $k > 0$ and $n$-vertex graph $G$, $G^{(k)}$ is an induced subgraph of $K_n^{(k)}$. Thus the class $G_f$ is contained in the hereditary closure of the graphs $K_n^{(f(n))}$ for $n \ge 0$. Since twin-width never increases when taking induced subgraphs, it suffices to consider graphs of the form $K_n^{(f(n))}$. As hinted at in Section 2.3, the forward implication of Theorem 27 could be derived from Theorem 21 and the fact that $o(\log n)$-subdivisions do not form a small class. We give a direct proof of a stronger statement.

Proof. Let $G$ be $K_n^{(k)}$, for some positive integer $k$. Assuming that $G$ has twin-width at most $d$, we show that $k \ge \log_{d+1}(n-1) - 1$. Note that the assumption $k > 0$ is required because $K_n^{(0)} = K_n$ has twin-width 0. In a $d$-contraction sequence of $G$, let us consider the first step in which two vertices $x, y$ of the original $K_n$ are contracted. Let $\mathcal{P}$ be the partition of $V(G)$ at this step, and $P_0 \in \mathcal{P}$ the part containing $x$ and $y$. In $G$, consider the $n-1$ paths, on $k+1$ edges each, resulting from the subdivided edges starting at $x$. We partition the vertices of these paths as $V_1, \ldots, V_{k+1}$, where $V_i$ contains all the vertices at distance $i$ from $x$. Then $V_{k+1}$ contains all the vertices of the original $K_n$ except $x$. In particular, no two vertices of $V_{k+1}$ are in the same part of $\mathcal{P}$.
All the vertices of $V_1$ are neighbors of $x$ but not of $y$, thus for any part $P \in \mathcal{P} \setminus \{P_0\}$ intersecting $V_1$, $PP_0$ is a red edge in $G_{\mathcal{P}}$. Thus at most $d+1$ parts of $\mathcal{P}$ intersect $V_1$, and there exists $P_1 \in \mathcal{P}$ such that $|P_1 \cap V_1| \ge \frac{n-1}{d+1}$. Observe that $P_1$ may well be equal to $P_0$. Similarly the set of vertices of $V_2$ with a neighbor in $P_1 \cap V_1$, of size at least $\frac{n-1}{d+1}$, is split in at most $d+1$ parts in $\mathcal{P}$. Thus there is a part $P_2 \in \mathcal{P}$ (that may be $P_0$ or $P_1$) which contains at least $\frac{n-1}{(d+1)^2}$ vertices of $V_2$. It follows by induction that for every $i \in [k+1]$, there exists a part of $\mathcal{P}$ containing at least $\frac{n-1}{(d+1)^i}$ vertices of $V_i$. However no part of $\mathcal{P}$ contains more than one vertex of $V_{k+1}$. Hence $\frac{n-1}{(d+1)^{k+1}} \le 1$, that is, $k \ge \log_{d+1}(n-1) - 1$.

The converse relies on some results on decompositions of permutations. We now encode a permutation $\sigma$ in the usual way, as the sparse matrix with entry 1 at position $(i, \sigma(i))$, and 0 elsewhere. (This is unlike the more cumbersome but technically-motivated dense encodings used in Section 3.6 and [6, Section 6.1].) A permutation $\sigma$ is a $t$-merge if its domain can be partitioned into $t$ possibly-empty discrete intervals $I_1, \ldots, I_t$ such that the restriction of $\sigma$ to $I_i$ is increasing. Merging $t$ sorted lists can be expressed as the application of some well chosen $t$-merge to the concatenation of the lists. A permutation $\sigma$ is a parallel $t$-merge if its domain can be partitioned into an arbitrary number of intervals $J_1, \ldots, J_r$ such that $\sigma$ operates independently on each $J_i$ (i.e., $\sigma(J_i) = J_i$), and the restriction $\sigma|_{J_i}$ is a $t$-merge. See Figure 7 for an example of a parallel 2-merge.

Lemma 29. For any $t, \ell \in \mathbb{N}$, any permutation on $t^\ell$ elements can be decomposed as a product of at most $\ell$ parallel $t$-merges.
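The definitions of $t$-merge and parallel $t$-merge above can be checked mechanically. The sketch below relies on two observations that are assumptions of this illustration rather than statements from the paper: a sequence splits into $t$ increasing intervals exactly when it has at most $t-1$ descents, and it suffices to test the finest partition into intervals $J$ with $\sigma(J) = J$. Permutations are given in one-line notation on $1, \ldots, n$, and all function names are hypothetical.

```python
from bisect import bisect_left

def descents(seq):
    """Number of positions where the sequence goes down."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a > b)

def invariant_blocks(perm):
    """Finest split of perm into consecutive blocks J with perm(J) = J."""
    blocks, start, running_max = [], 0, 0
    for pos, value in enumerate(perm, start=1):
        running_max = max(running_max, value)
        if running_max == pos:          # the first pos values are exactly 1..pos
            blocks.append(perm[start:pos])
            start = pos
    return blocks

def is_parallel_t_merge(perm, t):
    """True if every invariant block is made of at most t increasing intervals."""
    return all(descents(block) <= t - 1 for block in invariant_blocks(perm))

def longest_decreasing(perm):
    """Length of a longest decreasing subsequence (patience sorting on negated values)."""
    tails = []
    for value in perm:
        i = bisect_left(tails, -value)
        if i == len(tails):
            tails.append(-value)
        else:
            tails[i] = -value
    return len(tails)

# The permutation 23514687 of Figure 7.
sigma = [2, 3, 5, 1, 4, 6, 8, 7]
```

On the permutation of Figure 7, the blocks are 23514, 6 and 87, each with at most one descent, so it is a parallel 2-merge but not a parallel 1-merge, and its longest decreasing subsequence has length 2.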
Proof. Let $\sigma$ be a parallel $t$-merge, with its domain partitioned into intervals $J_1, \ldots, J_r$ such that $\sigma(J_i) = J_i$, and every $\sigma|_{J_i}$ is a $t$-merge. Assume for a contradiction that $\sigma$ contains a $(t+1)$-grid. Then it contains a decreasing subsequence of length $t+1$.
For any $i < j$, $x \in J_i$ and $y \in J_j$, one has $x < y$ and $\sigma(x) < \sigma(y)$ because $J_i, J_j$ are disjoint intervals, with $\sigma(J_i) = J_i$ and $\sigma(J_j) = J_j$. It follows that any decreasing subsequence is contained entirely in one of the $J_k$. Thus, there exists a $t$-merge $\sigma|_{J_k}$ which contains a decreasing subsequence of length $t+1$.
Since $\sigma|_{J_k}$ is a $t$-merge, $J_k$ is itself partitioned into intervals $I_1, \ldots, I_t$ such that $\sigma$ is increasing on $I_i$. Hence each $I_i$ can contain at most one element of a decreasing subsequence, and $\sigma|_{J_k}$ contains no decreasing subsequence of length more than $t$, a contradiction.

We want to order $V(G)$ such that the adjacency matrix of $G$ in that order is $r$-grid free, for some $r$ depending only on $c$. This implies the desired twin-width bound by Theorem 2.
Choose an arbitrary orientation of the edges of $K_n$. In $G$, the edges of $K_n$ become directed paths on $k+1$ edges. Then, for $0 \le i \le k$, let $V_i \subset V(G)$ contain every $i$-th vertex along these directed paths. In particular, $V_0$ corresponds to the vertices of $K_n$, while $V_1, \ldots, V_k$ are all the vertices created by the subdivision. Thus, $V_0, \ldots, V_k$ is a partition of $V(G)$.
Let us now define an order within each $V_i$. Choose $x_1, \ldots, x_n$ an arbitrary order on $V_0$. The extremal set $V_1$ is ordered according to the neighbors in $V_0$, i.e., with first the neighbors of $x_1$ in any order, then the neighbors of $x_2$, etc. We proceed similarly for $V_k$. The disjoint paths in $G - V_0$ define a bijection between $V_1$ and $V_k$, which can be interpreted as a permutation $\sigma$ on $\frac{n(n-1)}{2}$ elements according to the previous orderings. Then, choosing orderings for $V_2, \ldots, V_{k-1}$ is equivalent to decomposing $\sigma$ as a product of $k-1$ permutations $\sigma_1, \ldots, \sigma_{k-1}$. By Lemma 29, we may choose $\sigma_1, \ldots, \sigma_{k-1}$ to be parallel $t$-merges for any $t$ large enough that $\sigma$ decomposes into $k-1$ such factors. This is satisfied by $t = 2^{2c}$, which crucially is independent of $n$. With this choice of decomposition for $\sigma$, we have ordered $V(G)$ as $V_0, V_1, \ldots, V_k$, where each $V_i$ is ordered as previously defined.
Let $M$ be the adjacency matrix of $G$ respecting this ordering. Let $R_0, \ldots, R_k$ (resp. $C_0, \ldots, C_k$) be the partition of the rows (resp. columns) of $M$ induced by the partition $V_0, \ldots, V_k$ of $V(G)$. Then $(\mathcal{R}, \mathcal{C}) = (\{R_0, \ldots, R_k\}, \{C_0, \ldots, C_k\})$ is a division of $M$. For $i, j \in [0, k]$, let $M_{i,j}$ be the zone $R_i \cap C_j$, which corresponds to the adjacency matrix between $V_i$ and $V_j$. The zone $M_{i,j}$ is non-zero if and only if $i = j \pm 1$ modulo $k+1$. Thus, there are $2k+2$ non-zero zones, forming a double diagonal with corners (see Figure 9).

Claim 32. Every zone of the division $(\mathcal{R}, \mathcal{C})$ of $M$ is $(t+1)$-grid free.
Proof. For $1 \le i < k$, the zones $M_{i,i+1}$ and $M_{i+1,i}$ are parallel $t$-merges or transposes thereof, hence are $(t+1)$-grid free by Lemma 30. The zones $M_{0,1}$, $M_{1,0}$, $M_{0,k}$, and $M_{k,0}$ are composed of a single monotone sequence, hence are 2-grid free.
Claim 33. There is a set $A \subset \mathcal{C}$ of at most 5 column-parts such that every

The above implies that any $C' \in \mathcal{C}'$ must intersect some $C \in A$. Finally we have i i , which implies $|A| \le 5$.

Of course, claims 33 and 34 still hold when inverting the roles of rows and columns. Thus, there are $R \in \mathcal{R}$, $C \in \mathcal{C}$ such that $R$ (resp. $C$) contains at least $\frac{\ell - 10}{5}$ parts of $\mathcal{R}'$ (resp. $\mathcal{C}'$) as subsets. Hence the zone $R \cap C$ contains an $\frac{\ell - 10}{5}$-grid induced by the corresponding parts of $\mathcal{R}'$ and $\mathcal{C}'$. By Claim 32, it follows that $\frac{\ell - 10}{5} \le t$, or $\ell \le 5t + 10$. Recall that $t$ was chosen as $t = 2^{2c}$. Hence we have proved that $M$ is $g(c)$-grid free for $g(c) = 5 \cdot 2^{2c} + 11$.
A fortiori M is g(c)-mixed free, and by Theorem 2 the twin-width of G is at most f (c) for some f (c) double-exponential in g(c), hence triple-exponential in c.
In the next section, we will show that graphs with queue number $t$ have twin-width $2^{2^{O(t)}}$ (see Theorem 38). This can be used to get an alternative proof to Proposition 31, albeit not self-contained. Indeed it was shown that the 2 log d n/2 + 1-subdivision of $K_n$ (see [13, Theorem 4]) has queue number at most $d$.

Sparse twin-width
We start this section by showing the list of equivalences of Theorem 12.
structures with a constant number of binary relations). To specify the transduction, we explain how every fixed $r$-shallow minor $H$ is obtained. Let $G \in \mathcal{B}$ be an edge-bicolored graph containing $H$ as a spanning and induced $r$-shallow minor, where each contracted set induces a tree in $G$. More precisely, the colors on $E(G)$ are such that every "edge of $H$" is colored 2, while every contracted edge (that is, other edge) is colored 1; here $E_1$ is the edge set colored 1, and $E_2$ is the edge set colored 2. The edge interpretation $\varphi(x, y)$ links two vertices $u, v$ if they are "reference vertices" for their contracted set, and there is an edge colored 2 between two vertices $u', v'$ where there is a path of edges colored 1 of length at most $2r$ between $u$ and $u'$, and between $v$ and $v'$. Such paths always exist within a contracted set since the radius is at most $r$, hence the diameter is at most $2r$. Finally the graph obtained by this $(U, \varphi)$-interpretation is exactly $H$.
We now want to bound the average degree of the r-shallow minors in ∇ r (C) by some value f (r).Since ∇ r (C) is subgraph-closed (every subgraph of an r-shallow minor is an r-shallow minor), Sub(∇ r (C)) = ∇ r (C) has bounded twin-width.Thus (iv) implies (iii) for the class ∇ r (C).Therefore ∇ r (C) has bounded average degree, and C has bounded expansion.
In the previous proof the heredity of $\mathcal{C}$ is only used to show that (iii) implies (i). It is not an artifact of the proof since $\{K_{t,t} \uplus t^2 K_1\}_{t \in \mathbb{N}}$ is a class of bounded twin-width where all graphs have linearly many edges, but admits arbitrarily large bicliques. The equivalences (i) ⇔ (ii) ⇔ (iv) ⇔ (v) hold for every (possibly non-hereditary) class of bounded twin-width. Bounded sparse twin-width classes remain surprisingly diverse. They for instance contain $K_t$-minor free graphs and bounded-degree bounded twin-width graphs, which in turn contain some expander classes. In particular bounded sparse twin-width graphs do not have polynomial expansion.

Flat classes
For any graph invariant $\iota$, we say that a class $\mathcal{C}$ is $\iota$ flat if it is included in $\mathrm{Sub}(\mathcal{G} \boxtimes \mathcal{H})$ with $\mathcal{G}$ and $\mathcal{H}$ two classes of bounded $\iota$, and $\mathcal{H}$ also has bounded degree. Recalling the definition in Section 2, a class is flat if it is treewidth flat. We will see that twin-width flat classes have bounded twin-width. It will imply that (treewidth) flat classes are other examples of bounded sparse twin-width classes.
We say that G is a trigraph over a graph H if (V (G), E(G) ∪ R(G)) is isomorphic to H. Thus G is obtained from the graph H by coloring red some of its edges.More generally G is a trigraph over a trigraph H if there is a graph isomorphism from (V (G), E(G) ∪ R(G)) to (V (H), E(H) ∪ R(H)) such that every black edge of G is mapped to a black edge of H. Again G is obtained from the trigraph H by coloring red some of its black edges.We start by bounding the twin-width of trigraphs over graphs with bounded degree and bounded twin-width.

Lemma 35. Every trigraph over a graph H has twin-width at most tww(H) + ∆(H).
Proof. Consider a $\mathrm{tww}(H)$-sequence of $H$. A simple but important observation is that the black degree of a vertex never increases in a contraction sequence. Thus each trigraph of the sequence has total degree at most $\Delta(H) + \mathrm{tww}(H)$. Therefore, when the same sequence is applied to any trigraph over $H$, the overall maximum (red) degree is also bounded by $\Delta(H) + \mathrm{tww}(H)$.
We can now show the following.

First we contract $G \boxtimes H$ to a trigraph over $H$ by a sequence containing as intermediate steps trigraphs over $G_n \boxtimes H, G_{n-1} \boxtimes H, \ldots, G_1 \boxtimes H$. Say $G_i$ is obtained from $G_{i+1}$ by contracting $u, v \in V(G_{i+1})$ into vertex $w$, then the part of the $d$-sequence from a trigraph over $G_{i+1} \boxtimes H$ to one over $G_i \boxtimes H$ consists of contracting, in any order, the vertices $(u, j)$ and $(v, j)$ into vertex $(w, j)$, for every $j \in [h]$. As the red degree of $w \in V(G_i)$ is at most $d_G$, vertex $(w, j)$ has red degree at most $d_G(\Delta + 1) + 2\Delta$. This is because the $j$-th copy of $G$ is linked to the $j'$-th copy only if $j' \in N_H[j]$. This explains the $d_G(\Delta + 1)$ term. The additional $2\Delta$ accounts for possible red edges between $(w, j)$ and $(\ell, j')$, where $\ell \in \{u, v, w\}$ and $j' \ne j$.
We can now finish the $d$-sequence from the obtained trigraph over $K_1 \boxtimes H$, which is isomorphic to $H$, using the $d_H$-sequence of $H$. Indeed by Lemma 35 this trigraph admits a $(d_H + \Delta)$-sequence.
Proof. By Theorem 9, $G \boxtimes H$ has twin-width bounded by a function of $\mathrm{tww}(G)$, $\mathrm{tww}(H)$, and $\Delta(H)$. The implication (i) ⇒ (iv) (via (ii)) in Theorem 12 does not require that the bounded twin-width class $\mathcal{C}$ is hereditary. Thus, $G \boxtimes H$ being $K_{t,t}$-free, the subgraph closure $\mathrm{Sub}(G \boxtimes H)$ has twin-width bounded by a function of $\mathrm{tww}(G)$, $\mathrm{tww}(H)$, $\Delta(H)$, and $t$.
Lemma 36. If $G$ is $K_{t,t}$-free, then $G \boxtimes H$ is $K_{s,s}$-free where $s = 2t(\Delta(H) + 1)$.
Proof. Assume, for the sake of contradiction, that there exist disjoint vertex sets $A, B \subseteq V(G \boxtimes H)$ such that $|A| = |B| = 2t(\Delta(H) + 1)$ and $A, B$ are fully adjacent. Let $V(H) := [h]$ and let $a \in A$ be the vertex $(v, j)$ for some $v \in V(G)$ and $j \in V(H)$. Since $j$ is adjacent with at most $\Delta(H)$ vertices of $H$, and $(u, i)$ is adjacent with $(v, j)$ only if $i = j$ or $ij \in E(H)$, $B$ is contained in the union of at most $\Delta(H) + 1$ copies of $G$. This means that there exists some $j^* \in [h]$ such that the $j^*$-th copy of $G$ contains a set $B'$ of at least $2t$ vertices of $B$. Likewise, there is an $i^* \in [h]$ such that the $i^*$-th copy of $G$ contains a set $A'$ of at least $2t$ vertices of $A$. Let $A'' \subseteq A'$ and $B'' \subseteq B'$ be vertex sets of size $t$ such that the first coordinates of the vertices in $A'' \cup B''$ are pairwise distinct. Then the vertex subset of $V(G)$ which appears as the first coordinates in $A'' \cup B''$ forms a $K_{t,t}$, a contradiction.

Theorem 10 and Lemma 36 imply that flat classes have bounded twin-width, since bounded treewidth classes have sparse bounded twin-width (they are $K_{t,t}$-free and have bounded twin-width). In particular, it provides an alternative proof that planar graphs have bounded twin-width (see [6, Section 6]). The obtained bound remains bad since we still need to use Theorem 2 to justify that the subgraph closure of a $K_{t,t}$-free bounded twin-width class has bounded twin-width.
Lemma 42. If $S$ and $S'$ are two finite generating sets of the group $\Gamma$, then $F(\Gamma, S)$ has bounded twin-width if and only if $F(\Gamma, S')$ has bounded twin-width.
Proof. Let us assume that $F(\Gamma, S)$ has bounded twin-width. The first step is to show that a more general object has bounded twin-width. Namely, let us consider the oriented labeled Cayley graph $\mathrm{OLCay}(\Gamma, S)$ where every edge $\{x, x \cdot s\}$ is furthermore oriented from $x$ to $x \cdot s$ and labeled by $s$. Note that the class $\mathrm{OLF}(\Gamma, S)$ of all finite induced restrictions of $\mathrm{OLCay}(\Gamma, S)$ is contained in the (more general) class $\mathcal{C}$ of all orientations of graphs of $F(\Gamma, S)$ which are edge-labeled by $S$. The key fact is that $\mathcal{C}$ has bounded twin-width. Indeed, given any class of graphs $\mathcal{G}$ with degree at most $d$ and twin-width at most $t$, the class $\mathcal{G}_s$ consisting of $\{1, \ldots, s\}$ edge-labeled orientations of graphs of $\mathcal{G}$ also has bounded twin-width. To see this, let us consider an element $O$ of $\mathcal{G}_s$ which is an oriented edge-labeled version of a graph $G$ of $\mathcal{G}$. We just have to show that we can interpret $O$ in terms of $G$. To start with, we consider for $G$ a linear order $L_G$ of its vertices, such that the adjacency matrix of $G$, ordered by $L_G$, has twin-width at most $f(t)$. When closed under induced restrictions, the class of birelations $(G, L_G)$ has bounded twin-width. Since the order $L_G$ provides for every vertex an order on its incident edges, we can furthermore label the vertices of $(G, L_G)$ using $2^d$ colors in order to code for every vertex $v$ how the (at most) $d$ edges incident to it are oriented. Therefore the class of orientations of $\mathcal{G}$ can be interpreted from the class of $(G, L_G)$ vertex-labeled by $2^d$ colors, and thus has bounded twin-width. For the edge-labeled version, we just have to label the vertices with $2^d |S|^d$ colors. To conclude the proof, we observe that since every generator $s' \in S'$ can be expressed with $S$, the class $\mathrm{OLF}(\Gamma, S')$ is contained in an FO transduction of $\mathrm{OLF}(\Gamma, S)$. Therefore, by Theorem 4, $\mathrm{OLF}(\Gamma, S')$, and thus $F(\Gamma, S')$, has bounded twin-width.
Therefore, if the small conjecture does not hold, the class of finitely generated groups splits into bounded twin-width groups and unbounded twin-width groups.This could reflect a known dichotomy for groups.A natural candidate for a finitely generated group of unbounded twin-width, would be a group with no finite presentation.For instance the lamplighter group is an interesting test case, but its associated class of graphs has indeed bounded twin-width.A first step towards Conjecture 39 is to show that finitely presented groups have bounded twin-width.
For S ⊆ V (G), we denote the open neighborhood (or simply neighborhood) of S by N G (S), i.e., the set of neighbors of S deprived of S, and the closed neighborhood of S by N G [S], i.e., the set N G (S) ∪ S. We simplify N G ({v}) into N G (v), and N G [{v}] into N G [v].We denote by G[S] the subgraph of G induced by S, and G − S := G[V (G) \ S].For two disjoint sets A, B ⊆ V (G), E(A, B) denotes the set of edges in E(G) with one endpoint in A and the other one in B. Two distinct vertices u, v such that N (u) = N (v) are called false twins, and true twins if N [u] = N [v].Two vertices are twins if they are false twins or true twins.For two vertices u H ∈ H}, where G and H are two sets of graphs.Given a class C, we denote by Sub(C) the class of all subgraphs of members of C. The class Sub(C) is by definition subgraph-closed, and is called the subgraph closure of C. Similarly the hereditary closure of a class C consists of all the induced subgraphs of members of C, and is hereditary by design.
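The notions of false twins and true twins recalled above are easy to test on an adjacency-set representation. The following is a minimal sketch; the function name `twin_type` and the dictionary layout are assumptions made for illustration.

```python
def twin_type(adj, u, v):
    """Classify two distinct vertices of a graph given as dict vertex -> set of neighbors.

    Returns "false" if N(u) = N(v), "true" if N[u] = N[v], and None otherwise.
    """
    if u == v:
        raise ValueError("twins are defined for distinct vertices")
    if adj[u] == adj[v]:
        return "false"          # same open neighborhood, hence u and v are non-adjacent
    if adj[u] | {u} == adj[v] | {v}:
        return "true"           # same closed neighborhood, hence u and v are adjacent
    return None

# A path a - b - c: its endpoints are false twins.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
# A triangle: any two vertices are true twins.
triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
```

Contracting twins never creates a red edge, which is why graphs that can be reduced to a single vertex by iteratively contracting twins (such as cliques) have twin-width 0.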

Figure 1 Contraction of vertices u and v, and how the edges of the trigraph are updated.

Figure 2 To the left, a neat division: each zone is horizontal, or vertical, or full with $r$ entries (mixed zone). Note that the division is symmetric but not the matrix. To the right, in bold, a 3-mixed minor of the neat division. Observe that it coarsens the neat division and contains in each of its 9 zones either a 0,1-corner or a mixed zone (framed by red dashed boxes).

Definition 13. Let $M_{n,d}$ be the class of the neatly divided $n \times n$ symmetric $0, 1, r$-matrices $(M, (\mathcal{R}, \mathcal{C}))$, such that $(\mathcal{R}, \mathcal{C})$ is symmetric and has: mixed value at most $4c_d$, part size at most $2^{4c_d + 2}$, and no $d$-mixed minor.

Lemma 16.
Figure 3 Illustration of the injection from the mixed zones and cuts of C on (M , (R, C )) to the mixed zones of C on (M , (R1, C )) and (M , (R2, C )).The mixed cuts are represented by red solid lines, and an arbitrary choice of an overlapping 0, 1-corner.

Lemma 17. Let $(M', (\mathcal{R}, \mathcal{C}'))$ be the coarsening of a neatly divided matrix $(M, (\mathcal{R}, \mathcal{C}))$ resulting from the fusion of a single pair of consecutive parts $C, C' \in \mathcal{C}$, with $C \cup C' = C^*$. Then for every part $R \in \mathcal{R}$, the mixed value of $R$ on $(M', (\mathcal{R}, \mathcal{C}'))$ is at most the mixed value of $R$ on $(M, (\mathcal{R}, \mathcal{C}))$.

Proof. Again this symmetrically works if we switch the role of $\mathcal{R}$ and $\mathcal{C}$. (The proof of that statement follows as in [6, Lemma 11].) If the zone $R \cap C^*$ is not mixed in $(M', (\mathcal{R}, \mathcal{C}'))$, then the mixed value of $R$ has not changed after the fusion of $C$ and $C'$. If, on the contrary, the zone $R \cap C^*$ is mixed in $(M', (\mathcal{R}, \mathcal{C}'))$, then at least one of the three following propositions holds

Lemma 19. Let $(M, (\mathcal{R}, \mathcal{C})) \in M_{n,d}$ be a neatly divided matrix with two equal rows $\rho, \rho'$ in a part $R \in \mathcal{R}$, hence symmetrically two equal columns $\gamma, \gamma'$ in a part $C \in \mathcal{C}$. Then removing row $\rho'$ and the symmetric column $\gamma'$ yields a neatly divided matrix of $M_{n-1,d}$.

Proof. By design the new matrix and division are symmetric. The new neatly divided matrix remains $d$-mixed free. The part size can only decrease, as well as the mixed value.

Theorem 21. There is a triple-exponential function $f : \mathbb{N} \to \mathbb{N}$ such that the number of $n$-vertex trigraphs with twin-width at most $d$ is at most $n!\, f(d)^n$.

Proof. Let $G = (V = [n], E, R)$ be a trigraph with twin-width at most $d$. By Lemma 20, $G$ has versatile twin-width at most $D := 4c_{2d+1} 2^{4c_{2d+2} + 2}$, and admits a versatile tree of $D$-contractions. We now say that a vertex $u$ is $D$-good if there is another vertex $v$ such that the contraction of $u$ and $v$ is $D$-correct. The versatile tree of $D$-contractions provides $n/s$ $D$-good vertices, with $s := 8 \cdot 2^{4c_{2d+2} + 1} = 2^{4(c_{2d+2} + 1)}$. Let $I_{n,D}$ be the class of trigraphs with twin-width at most $D$ on vertex set $[n]$ and $L_{n,D}$ the subset of $I_{n,D}$ consisting of trigraphs such that vertex $n$ is $D$-good. Since $n/s \cdot |I_{n,D}| \le n |L_{n,D}|$, it holds that $|I_{n,D}| \le (s+1) |L_{n,D}|$.
$B$, $b_1, b_2, \ldots, b_n$. Then we put an edge between $a_i$ and $b_j$ if and only if $i < j$. The half-graph between $B$ and $C$ is built similarly. We choose another order for the vertices of $B$, say, $b'_1, b'_2, \ldots, b'_n$, and an order for $C$, $c_1, c_2, \ldots, c_n$. Then we put an edge between $b'_i$ and $c_j$ if and only if $i < j$. It is important that the choices of the orders $b_1, \ldots, b_n$ and $b'_1, \ldots, b'_n$ are independent.

Figure 5 A representation of the graph of Figure 4, where the cliques induced by $A$, $B$, $C$ are replaced by independent sets, with axis-parallel triangle-free unit segments. The upside-down permutation matrix of $\sigma = 41532$ is still visible as the right endpoints of the red segments.

4 Short parallel d-sequences and adjacency labeling schemes

Every $d$-contraction sequence of an $n$-vertex graph has length exactly $n - 1$, since each of its steps contracts exactly one pair of vertices. What if we allow parallel contractions where disjoint pairs of vertices may be contracted in a single step? In this section we adapt the results of Section 3 on versatile twin-width to prove the existence of parallel contraction sequences of logarithmic length. We then use them to provide an $f(d) \log n$-adjacency labeling scheme for graphs of twin-width at most $d$.

A parallel contraction in a trigraph $G$ consists of the successive contractions of any number of pairs of vertices $\{a_1, b_1\}, \ldots, \{a_\ell, b_\ell\}$, where $a_1, \ldots, a_\ell, b_1, \ldots, b_\ell$ are all distinct.

Lemma 23. Any $n$-vertex graph $G$ with twin-width at most $d$ admits a parallel $D$-sequence of length $O(s \cdot \log n)$ where $s, D$ are double exponential functions of $d$.

Proof. The proof is very similar to the one of Lemma 20. Let $G$ be an $n$-vertex graph with twin-width at most $d$, and let $A$ be a $d$-twin-ordered adjacency matrix of $G$. By Theorem 2, $A$ is $(2d+2)$-mixed free. We set $d' := 2d + 2$, $\ell := 2^{4c_{d'} + 1}$, $s := 8\ell$ and $D := 4c_{d'} \cdot 2\ell$. Applying Lemma 18 to the finest division of $A$ yields a coarsening $(A', (\mathcal{R}', \mathcal{C}')) \in M_{n,d'}$ with $n/s$ disjoint pairs of identical columns $(\gamma_1, \gamma'_1), \ldots, (\gamma_{n/s}, \gamma'_{n/s})$, corresponding to pairs of vertices $(a_1, b_1), \ldots, (a_{n/s}, b_{n/s})$.

Figure 7 A parallel 2-merge matrix, corresponding to the permutation 23514687. Note that the first row is at the bottom, as is common with permutation matrices. It is composed of three blocks, each of which can be partitioned in two increasing subsequences, indicated by the dashes. Empty areas are filled with 0.

Figure 8 Left: the permutation $\tau$ to sort. Center: the 2-merge permutation $\sigma$ to use on $\tau$. Right: the composition $\sigma^{-1} \circ \tau$; one may inductively sort the two blocks by applying further parallel 2-merges.

Proposition 31. For any $c > 0$, the class of cliques $K_n$ subdivided at least $\frac{\log n}{c}$ times has twin-width at most $f(c)$ for some triple-exponential function $f$.

Proof. Let $k \ge \frac{\log n}{c}$, and let $G$ be $K_n^{(k)}$.

Figure 9 The adjacency matrix $M$ of $G$, with the appropriate ordering of the vertices.
Let us first give a brief sketch of Norine et al.'s proof, which works by induction on $n$. They say that a vertex is $d$-good if it has degree at most $d$ and either has a twin or has a neighbor with degree at most $d$. They show the following technical lemma: $K_t$-minor free $n$-vertex graphs have at least $n/d$ $d$-good vertices, for some $d$ function of $t$ only. Let $I_{n,t}$ be the set of $K_t$-minor free graphs on $[n]$, and $K_{n,t}$ be the subset of all those graphs of $I_{n,t}$ where vertex $n$ is $d$-good. By their lemma $n/d \cdot |I_{n,t}| \le n |K_{n,t}|$, hence $|I_{n,t}| \le d |K_{n,t}|$. Furthermore, any graph of $K_{n,t}$ admits an index $i \in [n-1]$ such that either $i$ and $n$ are false twins, or $i$ and $n$ are adjacent and have at most $d - 1$ other neighbors each. Therefore any $G \in K_{n,t}$ can be obtained from a $G' \in I_{n-1,t}$ and $i \in [n-1]$ by either introducing a new vertex $n$ false twin of $i$ (one graph), or by splitting $i$ into $i$ and a new vertex $n$ adjacent to $i$, and by distributing in $G$ the at most $2(d-1)$ neighbors of $i$ in $G'$ into: neighbors of $i$ only, neighbors of $n$ only, and common neighbors (at most $3^{2(d-1)}$ graphs). Hence |I n,t

1. Let us recall that an FO transduction consists of adding a set of $O(1)$ non-deterministic unary relations (or coloring of the vertices with $O(1)$ colors), defining the new vertices and edges by means of FO formulas, and deleting all colors and potentially some vertices. Here we only need one unary relation, say, $U$, and we focus on such a coloring where $U(v)$ holds for exactly one vertex $v$ in every contracted set. The new vertices are simply defined by the formula $U(x)$. Then we can define the edges by the formula $\varphi(x, y) = U(x) \wedge U(y) \wedge \exists x' \exists y' \; d_1^{\le 2r}(x, x') \wedge d_1^{\le 2r}(y, y') \wedge E_2(x', y')$, with $d_1^{\le 2r}$