User Tools

Site Tools


public:de_bruijn

de Bruijn sequences

We have described generalized de Bruijn sequences for neuroscience experiments, and introduced a novel form (“path-guided”) specialized for neuro-vascular (e.g., BOLD fMRI) experiments:

These sequences are particularly well suited for carry-over designs, although are of utility in any neuroscience experiment that examines the relationship between sequentially presented stimuli and neural response. The path-guiding approach is useful for any application in which a systematic modulation of path length is to be encoded in a pseudo-random sequence.

In this wiki page, we provide a basic introduction to de Bruijn sequences. The link below leads to a page containing software and instructions for creating your own sequences.

Motivation

A hash function is a central computational and cryptographic technique in which variable length data are mapped to a fixed length variable. One application of this in cryptography is a “checksum” operation, which boils down an arbitrary message into a small, fixed-length variable called a hash code. A given message can only result in one valid hash code, so tampering with the message can be discovered by detecting a change in the hash.

Our key insight is that, effectively, the brain implements a temporal hash function upon the world. The complex, variable length visual experience we have is hashed down to be represented in a fixed set of neurons. Neuroscience is about cracking the hash function. De Bruijn sequences are essentially a brute-force cryptographic attack on the brain's hash function. The sequences let us replay history (in the limited, experimental context of the stimuli we study), try every possible variant, and figure out the hash algorithm.

Background

nicolaas_govert_de_bruijn.jpeg De Bruijn sequences are a cyclic ordering of k-labels of order-n such that travel through the sequence yields every possible sub-sequence of n consecutive labels. Equivalent statements of this property are that de Bruijn sequences contain every “word” of length n in the alphabet of k labels, or that de Bruijn sequences are counterbalanced at the level of n. The sequence has a length of kn.

The number of unique de Bruijn sequences for a given k and n is given by:

which is clearly vast for even modest values of k and n. Note that any one of these unique sequences may further be used to create k! derivative sequences by permuting the assignment of labels in the alphabet.

Cyclic sequences of this kind are a subject of study in discrete mathematics and have been discovered independently many times. Nicolaas Govert de Bruijn (pronounced roughly “duh-BROUN”), working with Tanja van Ardenne-Ehrenfest, formalized the sequences for k>2. After a long life and a productive career, NG de Bruijn died on February 17, 2012.

Generation of de Bruijn sequences

De Bruijn sequences may be represented, and thus constructed, as a closed path that visits every vertex of a de Bruijn graph: a network of kn connected nodes in which each node is one of the length n words that may be assembled from the k labels. One vertex has a directed edge to another if the word obtained by deleting the first symbol of the former word is the same as the word obtained by deleting the last symbol of the latter word. The de Bruijn cycle is given by a Hamiltonian cycle through the directed graph, which is a path that visits each vertex once and returns to the starting vertex.

Finding Hamiltonian circuits is an NP-complete problem, and thus generally solved by brute-force search algorithms. One approach begins at a random node in the graph and then selects randomly amongst the available neighboring nodes to trace a path. If a dead-end is reached from which there is no available next node, a “backtracking” process is used to re-route. This search approach is guaranteed to produce a valid Hamiltonian path (and thus a de Bruijn cycle) within a de Bruijn graph.

The figure (click to enlarge) shows a k=3, n=2 de Bruijn graph and cycle. For a set of three stimuli (or an alphabet of three letters), there are 9 pairings or transitions between the stimuli (or, 9, two-letter words). Each node (light blue) in the de Bruijn graph is a possible transition between stimuli,or two-letter word. The directed edges between the nodes indicate available sequential transitions between one word and the next, each of which shifts the word by one register and adds a new letter. The red lines indicate one of 24 possible Hamiltonian cycles through the graph. Below the graph is the necklace of nodes indicated by the red path. Below that is the de Bruijn cycle, derived from the last register of each bead on the necklace. Travel about the cycle yields every possible two-letter word (or pairing of stimuli) in the three-letter alphabet.

Path-Guided de Bruijn sequences

Given variation in the amplitude of expected responses for different stimuli or transitions, a sequence of stimuli may be created that will produce a modulation of neural activity within the band of temporal frequencies that are optimal for the imaging method. This is done by defining a guide function, which concentrates power within the desired band of temporal frequencies, and then biasing the path through the de Bruijn graph to follow that function. The guide function may be simply a sinusoid of the desired frequency, or the sum of sinusoids of random phases within a frequency range.

A path-guided sequence is generated by ranking the nodes that are available as the next point in the growing Hamiltonian path. The ranking favors nodes (and thus transitions) predicted to produce larger or smaller neural responses and this bias varies over the course of the sequence according to the guide function.

An equivalent description is that a sequence is generated in which a systematic variation in path-length in the Hamiltonian circuit is sought.

We have produced a command line (c code) executable which may be used to produce path guided de Bruijn sequences.

Specific forms of de Bruijn cycles

Specific forms of cyclic, counterbalanced sequences are contained within the larger class of de Bruijn sequences.

m-sequence

A modified (or “punctured”) de Bruijn sequence is created by removing one label (by convention, the label “zero”) from the single occurrence of the length n word composed of all zeros, yielding a sequence of length kn − 1. The set of Maximum Length Sequences (or M-Sequences) are contained within this set of punctured de Bruijn sequences.

In addition to the counter-balance properties of all de Bruijn sequences, m-sequences have the additional property of a nearly flat power spectrum (alternatively stated, the autocorrelation function of the sequence approaches a delta function).

Introduced for use in BOLD fMRI studies by Buracas and Boynton (2002), m-sequences provide ideal efficiency for estimating the shape of the hemodynamic transfer function, although relatively weak detection power. M-sequences are generated with maximal linear feedback shift registers, and exist only for alphabets in which k is the power of a prime number; there are no constraints upon n.

Type 1, Index 1 sequence

Type 1 Index 1 sequences, initially described by Finnery and Outhwaite in 1956 are n=2 counterbalanced sequences of k labels, for k > 5. T1I1 sequences have the property of presenting permuted blocks of the labels over the course of the sequence, with repetition of the labels at the boundaries of each block. They were introduced for use in fMRI by GK Aguirre (2007).

/home/aguirreg/html/wiki/data/pages/public/de_bruijn.txt · Last modified: 2013/10/21 21:17 by aguirreg