Turbo-Like Codes Constructions

Sergio Benedetto , ... Guido Montorsi , in Academic Press Library in Mobile and Wireless Communications, 2014

4.5.1 Computation of SISO updating with the minimal trellis

Any memoryless mapping, including, for example, all block encoders, can be conveniently represented over a suitably defined trellis diagram, exactly like a convolutional encoder. The main difference with respect to the trellis diagram of a convolutional encoder is that in this case the trellis diagram is time varying, i.e., the trellis sections change along the trellis diagram. Moreover, since the mapping is memoryless, the set of starting states of the first trellis section and the set of ending states of the last trellis section both have cardinality 1.

The trellis diagram corresponds to the mapping in the sense that every valid mapper configuration corresponds to a path in the trellis diagram. Furthermore, the SISO algorithm described in Section 4.4 for the SISO module of the trellis encoder can be extended to work also on time-varying trellis diagrams by allowing the trellis sections (s^S(e), u(e), c(e), s^E(e)) to vary along the trellis diagram, becoming (s_i^S(e), u_i(e), c_i(e), s_i^E(e)). The complexity of the SISO algorithm is proportional to the total number of edges in the trellis diagram. Thus, the complexity is minimized by finding, among all possible trellis representations of the considered mapping, the one associated with the minimal number of edges (the minimal trellis).

The procedure to find a minimal trellis diagram for a given mapping is not straightforward and is outside the scope of this book. The interested reader can refer to [36] for more details.

Here we consider as examples two very important minimal trellis representations used in practice, i.e., the one associated with a generic (1, n) repetition code (Figure 37) and the one associated with its dual, the (n − 1, n) parity-check code (Figure 38).

Figure 37. The minimal trellis diagram associated with a generic (1, n) repetition code.

Figure 38. The minimal trellis diagram associated with a generic (n − 1, n) parity-check code.

The trellis diagram of the repetition code is made up of n trellis sections of three different types. The first trellis section (A) is associated with one input bit and one output bit, the next n − 2 trellis sections (B) are associated with no input bits (one edge per state) and one output bit, and the final trellis section (C), again with no input bits and one output bit, merges into a single final state.

The total number of paths is 2, equal to the number of code words. The complexity, normalized to the number of sections, is:

C = (number of edges)/(number of trellis sections) = 2n/n = 2,

and is independent of the code size n.

The trellis diagram of the parity-check code (Figure 38) is also made up of n trellis sections of three different types. The first trellis section is associated, as before, with one input bit and one output bit; the next n − 2 trellis sections now have four edges each and are associated with one input bit and one output bit; and the final trellis section has a single final state, one output bit, and no input bits. In this case the total number of trellis paths is 2^(n−1), still equal to the number of code words. The complexity, normalized to the number of input bits, is then:

C ≈ 4n/n = 4.

Also in this case the complexity is independent of the code size, and much smaller than the complexity of the brute-force computation of (29) and (30), which is proportional to the number of code words.
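
As a quick check of the two complexity figures above, the following sketch (not taken from the book) counts the edges of the two minimal trellises, assuming exactly the section structure just described:

```python
# Sketch (not from the book): edge counts of the two minimal trellises described
# above, assuming the section structure given in the text.

def repetition_trellis_edges(n):
    """Edges per section of the minimal trellis of a (1, n) repetition code."""
    # Section A: 2 edges (input bit 0 or 1); sections B: one edge per state;
    # section C: two states merging into the single final state.
    return [2] + [2] * (n - 2) + [2]

def parity_check_trellis_edges(n):
    """Edges per section of the minimal trellis of an (n - 1, n) parity-check code."""
    # First section: 2 edges; middle sections: 4 edges (2 states x 2 input bits);
    # final section: 2 states merging into the single final state.
    return [2] + [4] * (n - 2) + [2]

for n in (4, 8, 16):
    rep = repetition_trellis_edges(n)
    par = parity_check_trellis_edges(n)
    print(n,
          sum(rep) / len(rep),   # complexity per section -> 2, independent of n
          sum(par) / (n - 1))    # complexity per input bit -> 4, independent of n
```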

URL: https://www.sciencedirect.com/science/article/pii/B9780123964991000029

Channel Coding and Decoding

Carl Nassar , in Telecommunications Demystified, 2001

7.2 Channel Decoding

Let's now explore what the channel decoder does. First, we'll define the task of the channel decoder. In Figure 7.6, at the channel coder, a bit sequence m comes in; we'll consider the channel coder of Figure 7.2 with m = (0 0 0) coming in. This is mapped to the output, which we'll call u; in the example we're considering, u = (00 00 00). These six bits are sent across the channel by the modulator and returned to six bits by the demodulator. The bits that come out of the demodulator, which feed the channel decoder, will be referred to as v. These bits v may or may not be equal to u. An error may have occurred in transmission, resulting in v = u + e, where e represents the bit error. For the example we're considering, we'll say we receive v = u + e = (00 00 00) + (00 00 01) = (00 00 01). In this case, one bit error has occurred in transmission at position 6.

Figure 7.6. The role of the channel (convolutional) decoder

The goal of the channel decoder, then, is to take the bits v that it has received, and come up with the best guess at m, the original information bits. We'll call the channel decoder's guess at m the vector m′. We want to find a way to make m′ as close to m as possible.

7.2.1 Using a Trellis Diagram

How does the channel decoder use a "trellis diagram" to help it figure out how to put out an m′ that matches m? The trellis diagram of Figure 7.5(b) shows the possible output bits of the channel coder. If you look at the labels on top of the solid and dashed lines of the trellis diagram, you see those possible outputs.

Let's return to our example of Figure 7.6 and consider the transmitted output u = (00 00 00). We can use the trellis diagram to verify that this u is a possible output of the channel coder. Looking at the trellis diagram, we see that (00 00 00) is a possible output by following the chain of 00 outputs that sit above the solid lines at the top of the trellis diagram. What about the received input v = (00 00 01)? Is that a possible output of the channel coder? If you look through the trellis diagram, you can find 00 output at time 1, followed by a 00 output at time 2, but you see it can't be followed by the output 01 at time 3. So the received v = (00 00 01) is not a possible output of the channel coder.

Therefore, one thing the channel decoder can do is look at the received v (for example, v = (00 00 01)) and ask itself if it matches an output path through the trellis. If the answer is no, then there must be an error in transmission. That's an easy way to detect errors.

A convolutional decoder can also correct errors using the trellis diagram. To do this, it simply asks itself the following: Given v = (00 00 01), what is the path in the trellis closest to this v (in terms of fewest bits different)? Exhaustively searching all paths in the trellis, the closest path to v = (00 00 01) is u′ = (00 00 00), as seen in Figure 7.7(a). The channel decoder decides that the sent bits must have been u′ = (00 00 00). "If these are the sent bits, then the input that created them is m′ = (0 0 0)," says the channel decoder, "and so we have decided on our output."

Figure 7.7. (a) Deciding on the closest path in the trellis (3 paths shown); (b) determining output bits at the convolutional decoder

There is a simple tool that we can use to find m′ once we have figured out u′, the closest path in the trellis to the received bits v. All we have to do is look at the trellis, at the series of solid and dashed lines that corresponds to u′. A solid line tells us a 0 is the bit in m′ and a dashed line tells us that a 1 is the bit in m′. Let me explain by example, using Figure 7.7(b) to help. In Figure 7.7(b), you see that u′ = (00 00 00) is the output path corresponding to the top lines of the trellis. Now, if you look at the first branch of the path, there you see below the 00 a solid line; that tells you the first bit in m′ is a 0. If you look below the second 00, you see a second solid line; this tells you the second bit in m′ is a 0, and so on.

So, the function of a convolutional decoder is really quite simple. It looks at v, then looks at the trellis diagram and searches it until it's found the output path u′ in the trellis closest to v. From this u′, it decides on and outputs m′.
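
The exhaustive search described above is easy to sketch in code. The snippet below is only an illustration, not the book's software: it uses a hypothetical rate-1/2 convolutional coder (generators 1 + D + D^2 and 1 + D^2), since Figure 7.2 is not reproduced here, and simply tries every possible message:

```python
# Illustration only (not the book's code): brute-force search for the trellis path
# closest to the received bits v. The encoder is a hypothetical rate-1/2
# convolutional coder; Figure 7.2's coder may differ.
from itertools import product

def encode(m):
    """Two coded bits per message bit, feedforward shift-register encoder."""
    d1 = d2 = 0                              # shift-register contents
    u = []
    for b in m:
        u += [b ^ d1 ^ d2, b ^ d2]           # taps 1+D+D^2 and 1+D^2
        d1, d2 = b, d1
    return u

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def brute_force_decode(v, num_message_bits):
    """Try every possible message, i.e., every path through the trellis."""
    return min(product([0, 1], repeat=num_message_bits),
               key=lambda m: hamming(encode(m), v))

v = [0, 0, 0, 0, 0, 1]                       # received bits, one error in position 6
print(brute_force_decode(v, 3))              # -> (0, 0, 0); ties go to the first path tried
```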

Example 7.2

For the convolutional coder described by the trellis diagram of Figure E7.2, determine the output of the channel decoder when the channel decoder receives v = (11 11).

Solution:

The trellis diagram is drawn over two times in Figure E7.3. Looking at this figure, we see that the output of the trellis coder can be (11 11) when it follows the shaded path. This tells us that the input bits that were sent must have been m = (0 0), since the best path corresponds to two solid lines, each solid line indicating an input bit of 0.

Figure E7.3. Trellis diagram over 2 times

7.2.2 The Viterbi Algorithm

One thing I haven't yet mentioned: if you have received, for example, v = (00 00 01 11 10 00 10 00 00 11 11 11 01 10 01 01 01 11 11 01), then you have to search through the trellis to find the path closest to this v. But that search is a long one. Luckily, a fellow named Viterbi presented a simple way to look through the trellis. The Viterbi Algorithm (VA for short) lets you (or better yet, your computer or DSP chip) find the closest path to v in the trellis easily and effectively.

Getting the Basic Idea The Viterbi Algorithm is based on a very simple idea: Start at time 0 in the trellis, and move through the trellis from left to right. At each time, you can systematically eliminate some of the paths in the trellis as candidates for the path closest to v. In fact, according to the Viterbi Algorithm, you can eliminate (at every time) all but four of the paths through the trellis, for the trellis of Figure 7.5. (In general, you can eliminate all but "s" paths through a trellis at any one time, where "s" is the number of states.)

This path elimination is briefly explained with the help of Figure 7.8. At time L+1, look at the top node. There are two paths that head into this node, one originating from parent node A and the other originating from parent node B. At time L+1, you can make a decision as to who is the better parent and choose between A and B. You repeat this for every node at time L+1, thus leaving you with four paths through the trellis at every time.

Figure 7.8. Underlying idea of VA

Understanding by Example Let's say we input to the channel coder of Figure 7.2 the bits m = (0 0), in which case it outputs the bits u = (00 00). Let's also say the channel decoder receives v = (00 00). Let's use the Viterbi Algorithm to search for u′, the path through the trellis closest to v. Once we've found u′, we can output m′, our guess at the bits sent into the channel coder.

First, we draw the trellis, and we associate with each start node the number 0. This means that we are not biased toward any one start node over any other. This is shown in Figure 7.9(a). Then, we start by examining the top node at time 1, which you can see in Figure 7.9(b). For this node, there are two possible parent nodes, node 0 at time 0 and node 1 at time 0. We are going to decide which is the best parent node, using this procedure:

Figure 7.9. (a) Setting initial values to 0; (b) picking best parent node (with lowest total) for node 0; (c) picking best parent node for node 1; (d) best parent node for nodes 2 and 3

1.

If we started at node 0 (time 0) and moved to node 0 (time 1), the first output bits would be 00. Comparing this to v where the first two output bits are 00, we say "0 bit errors if parent is node 0 (time 0)." We add this 0 to the 0 number that we gave node 0 (time 0) in Figure 7.9(a), for a grand total of 0.

2.

If we started at node 1 (time 0) and moved to node 0 (time 1), the first output bits would be 11. Comparing this to v where the first two output bits are 00, we say "2 bit errors if parent is node 1 (time 0)." We add this 2 to the 0 number that we gave node 1 (time 0) in Figure 7.9(a), for a grand total of 2.

Now, since starting at node 0 (time 0) and moving to node 0 (time 1) creates the fewest total bit errors (0), we proclaim that the parent node for node 0 (time 1) is node 0 (time 0), and that it carries with it the number 0 (for zero total errors with this selection). We repeat this for node 1 (time 1). You can see this happening in Figure 7.9(c). There, node 2 (time 0) and node 3 (time 0) are possible parent nodes. We decide between these as follows:

1.

For node 2 (time 0) as starting node moving to node 1 (time 1), the output is 10. Comparing this to the first two bits of v, which are 00, we say "1 bit error if parent node is node 2 (time 0)." We add this 1 to the 0 number that we gave node 2 (time 0) in Figure 7.9(a), for a grand total of 1.

2.

For node 3 (time 0) as starting node moving to node 1 (time 1), the output is 01. Comparing this to the first two bits of v, which are 00, we say "1 bit error if parent node is node 3 (time 0)." We add this 1 to the 0 number that we gave node 3 (time 0) in Figure 7.9(a), for a grand total of 1.

Since starting at node 2 (time 0) or node 3 (time 0) and moving to node 1 (time 1) creates the same total bit errors, we proclaim that the parent node for node 1 (time 1) is node 2 (time 0) (we use a tie-breaker rule of "always choose the top node in case of a tie"). That carries with it the number 1 (for one total error with this selection).

We repeat this for node 2 (time 1) and node 3 (time 1), and the result of doing this is shown in Figure 7.9 (d). That is all we do for our first move from left to right through the trellis.

At the next time, we do a very similar thing. We start again at the top node, this time starting with node 0 (time 2). Looking at Figure 7.10(a), we can see that this node has two possible parent nodes, which are node 0 (time 1) and node 1 (time 1). We decide between these two nodes as follows:

Figure 7.10. (a) best parent for node 0 (time 2) (b) best parent for nodes 1, 2, 3 (time 2)

1.

If we started at node 0 (time 1) and moved to node 0 (time 2), the second set of output bits would be 00. Comparing this to v where the second set of output bits are 00, we say "0 bit errors if parent is node 0 (time 1)." We add this 0 to the 0 number that we gave node 0 (time 1) in Figure 7.9(b), for a grand total of 0.

2.

If we started at node 1 (time 1) and moved to node 0 (time 2), the second two output bits would be 11. Comparing this to v where the second two output bits are 00, we say "2 bit errors if parent is node 1 (time 1)." We add this 2 to the 1 number that we gave node 1 (time 1) in Figure 7.9(c), for a grand total of 3.

Since starting at node 0 (time 1) and moving to node 0 (time 2) creates the fewest total bit errors (0), we proclaim that the parent node for node 0 (time 2) is node 0 (time 1), and that it carries with it the number 0 (for zero total errors with this selection). We repeat this for node 1 (time 2), node 2 (time 2) and node 3 (time 2); the results are shown in Figure 7.10 (b).

We continue this process until we have run through the trellis for the length of time corresponding to the length of v. In our case, since v consists only of two sets of 2 bits (v = (00 00)), we are done after two times. At the end, we have four end nodes with four numbers. For example, end node 0 (time 2) comes with value 0, while end node 2 (time 2) comes with the value 1. We choose the end node with the smallest value. In our example, looking at Figure 7.10, we choose the end node 0.

From here, we know the history of parent nodes, so we can "backtrack" through the trellis, and determine the u′ and the m′. And we're done. In our case, we choose node 0 (time 2) with a final value of 0, and we "backtrack" as shown in Figure 7.11. This leads us to the output m′ = (0 0).

Figure 7.11. Explaining "backtracking" through trellis to get output

Now you know just how the channel decoder works to undo the effects of channel coding, and along the way correct bit errors.
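
For readers who prefer to see the procedure as code, here is a minimal sketch of the algorithm just walked through: hard decisions, bit-error (Hamming) branch metrics, every start node initialized to 0, and ties broken in favor of the top parent. The 4-state next-state/output table is hypothetical, chosen only so that the transitions into nodes 0 and 1 match the branch outputs (00, 11, 10, 01) used in the walkthrough; it is not necessarily the coder of Figure 7.2.

```python
# Minimal sketch (not the book's code) of the procedure just described.
def viterbi(received_blocks, trellis, n_states):
    INF = float("inf")
    cost = [0] * n_states                       # Figure 7.9(a): all start nodes get 0
    paths = [[] for _ in range(n_states)]       # surviving input-bit history per node
    for r in received_blocks:                   # one trellis section per received block
        new_cost = [INF] * n_states
        new_paths = [None] * n_states
        for s in range(n_states):
            for bit, (nxt, out) in enumerate(trellis[s]):
                total = cost[s] + sum(a != b for a, b in zip(out, r))
                if total < new_cost[nxt]:       # strict "<" keeps the top parent on ties
                    new_cost[nxt] = total
                    new_paths[nxt] = paths[s] + [bit]
        cost, paths = new_cost, new_paths
    best = min(range(n_states), key=lambda s: cost[s])   # end node with the smallest total
    return paths[best]       # the decoder's guess m' (history kept directly, no backtracking needed)

# Hypothetical trellis: trellis[node][input bit] = (next node, output bits)
trellis = {
    0: [(0, (0, 0)), (2, (1, 1))],
    1: [(0, (1, 1)), (2, (0, 0))],
    2: [(1, (1, 0)), (3, (0, 1))],
    3: [(1, (0, 1)), (3, (1, 0))],
}
print(viterbi([(0, 0), (0, 0)], trellis, 4))    # -> [0, 0], as in the worked example
```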

Example 7.3

Use the Viterbi Algorithm to determine the output of a convolutional decoder, given

the input bits to the decoder are v = (11 11) and

the convolutional coder is described by the trellis diagram in Figure E7.2.

Solution:

Figure E7.4 shows the trellis diagram when the Viterbi Algorithm is performed. It shows (1) the number-of-errors computation for each branch; (2) the best parent selection at each node; (3) the selection of the best final node; and (4) backtracking to determine the best path through the trellis. From this best path, the decoder decides that the best output is (0 0).

Figure E7.4. Using the VA to determine best path

URL: https://www.sciencedirect.com/science/article/pii/B9780080518671500135

Channel Filtering and Equalizers

Carl Nassar , in Telecommunications Demystified, 2001

Solution:

Figure E9.1 shows the trellis diagram characterizing the received signal in Equation (E9.19). This diagram is created from Equation (E9.19) using the four steps outlined in the writing you just read.

Figure E9.1. Trellis describing received signal + Viterbi Algorithm used to get best output

Also shown in Figure E9.1 is the application of the Viterbi algorithm to determine the best path through the trellis. For each node, the best parent is found by choosing the parent with the smallest total distance. When we get to the last nodes, we look for the node with the smallest total and we backtrack through the trellis as marked by the arrows. These arrows tell us that the output of the trellis should be (-1 1) (based on the branch lines of the best path).
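
Since Equation (E9.19) and its trellis are not reproduced here, the fragment below is only a generic illustration of the one thing that changes relative to the hard-decision decoder sketched earlier: each branch now carries an expected received sample, the branch metric becomes a squared Euclidean distance, and the best parent is still the one with the smallest running total.

```python
# Generic illustration (not from the book): Euclidean branch metric for a trellis
# that describes a received signal rather than coded bits.
def branch_metric(received_sample, expected_sample):
    return (received_sample - expected_sample) ** 2

# e.g., received 0.2, candidate branches expecting +1 (parent A) and -1 (parent B):
print(branch_metric(0.2, +1.0), branch_metric(0.2, -1.0))  # 0.64 vs 1.44 -> parent A wins
```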

URL: https://www.sciencedirect.com/science/article/pii/B9780080518671500159

Information Theory and Coding

Richard E. Blahut , in Reference Data for Engineers (Ninth Edition), 2002

State Diagrams and Trellises

A constrained channel is one that does not accept input sequences containing any of a certain collection of forbidden subsequences. Such a channel can be described by a state diagram. Each state, s 0, s 1, …, s m–1, corresponds to a recent past history of channel inputs. Loosely, we say that the channel is in one of these states, but we mean that its past history is described by that state. For each state, only certain symbols from the set a 0, …, a I–1 can be transmitted next. When one of these has been transmitted, the state changes to a new state depending both on the old state and on the particular symbol transmitted.

Sometimes the state diagram is augmented by a time axis so that one can see how the channel state changes with time. It is then called a trellis.

The telegraphy channel gives a simple illustration of a state diagram. There are two states. The state specifies whether or not a space was the last symbol transmitted. In state s 0, only a dot or a dash can be sent next, and the state always changes. In state s 1, any symbol can be transmitted, and the channel changes state if a space is sent; otherwise it remains in the same state. The conditions are indicated in the state diagram shown in Fig. 3.

Fig. 3. State diagram for the telegraphy channel.

The trellis diagram for the telegraphy channel is shown in Fig. 4. The Morse code is a collection of some of the paths through this trellis. An optimum code would use all of the paths, but the mapping from the set of source output sequences to the set of trellis paths might be a complicated one.

Fig. 4. Trellis diagram for the telegraphy channel.

The digital magnetic-recording channel is another important example of a constrained channel. The channel input alphabet is binary, and a typical set of channel constraints, called run-length constraints, are that at least r zeros follow every one and that no more than s zeros occur in sequence. These constraints are imposed to minimize intersymbol interference and to facilitate clock recovery at the receiver.

In another kind of application, certain bit sequences are used for higher-level protocols or punctuation in the communication system and so must be forbidden in the data. A code must be used to translate from the set of unconstrained binary sequences into the set of sequences that satisfy the constraints.

Fig. 5 is a state diagram for a channel with r = 2 and s = 7 run-length constraints. Fig. 6 is the trellis diagram for the same channel; the trellis is drawn with state s 2 as the initial state. The characteristic equation for this state diagram is found to be x^8 − x^5 − x^4 − x^3 − x^2 − x − 1 = 0. The capacity, C, of the channel, equal to the base-two logarithm of the largest zero of the characteristic polynomial, is 0.518 bit/bit. Codes exist that record 0.518 bit of information for each channel bit recorded, but no codes can do better. Practical codes in use fall a little short of the channel capacity. The Franaszek code of Fig. 7 is a variable-to-variable-length code whose rate is 0.5. Hence, it achieves 96 percent of the channel capacity and yet is quite simple. A code that comes closer to channel capacity may be quite complex.

Fig. 5. State diagram of r = 2, s = 7 run-length-limited channel.

Fig. 6. Trellis diagram of r = 2, s = 7 run-length-limited channel.

Fig. 7. The r = 2, s = 7 Franaszek rate one-half code.
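
The capacity figures quoted above are easy to check numerically. The sketch below (not part of the handbook) finds the largest root of the characteristic polynomial of the r = 2, s = 7 constraint, and of the analogous polynomial for the r = 1, s = 3 constraint used later in the Adler-Hassner example:

```python
# Sketch (not part of the handbook): checking the quoted capacities. Capacity is the
# base-two logarithm of the largest root of the characteristic polynomial.
import numpy as np

# x^8 - x^5 - x^4 - x^3 - x^2 - x - 1 = 0  (r = 2, s = 7 constraint)
c27 = [1, 0, 0, -1, -1, -1, -1, -1, -1]
print(np.log2(max(abs(r) for r in np.roots(c27))))    # ~0.517, i.e., about 0.518 bit/bit

# Analogous polynomial for the r = 1, s = 3 constraint: x^4 - x^2 - x - 1 = 0
c13 = [1, 0, -1, -1, -1]
print(np.log2(max(abs(r) for r in np.roots(c13))))    # ~0.5515 bit/bit
```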

By inspection, one can see that any codeword of the Franaszek code followed by any other codeword satisfies the channel run-length constraints, and the codewords are uniquely decodable without punctuation. This last condition, known as the prefix condition, is due to the fact that no codeword looks like the beginning of any other codeword. This is an important requirement of a variable-length block code since the decoder must be able to break into codewords any concatenated string that is not punctuated.

Codes more elaborate than the Franaszek code are known, but they cannot be understood without developing more theory. The Adler-Hassner finite-state tree codes encode a continuous bit sequence into a continuous bit sequence without word boundaries. Fig. 8 shows the encoder and decoder for an Adler-Hassner code for the r = 1, s = 3 run-length-limited channel. This channel has a channel capacity of 0.5515 bit/bit. The rate one-half Adler-Hassner code implemented in Fig. 8 achieves 91 percent of the channel capacity. Binary symbols are clocked from the encoder into the channel and from the channel into the decoder at twice the source rate. Another Adler-Hassner code, say a rate 11/20 code, could achieve a larger fraction of the channel capacity, but the implementation would be much more complex.

Fig. 8. Encoder and decoder for an Adler-Hassner code.

URL: https://www.sciencedirect.com/science/article/pii/B9780750672917500273

Trellis-Coded Modulation (TCM)

Carl Nassar , in Telecommunications Demystified, 2001

Problems

1.

You want to build a TCM system that is to use the convolutional coder of Figure Q8.1 and the modulation scheme of Figure Q8.2.

(a)

Draw the trellis diagram representing the convolutional coder.

(b)

Assign modulation symbols to output bits by using mapping-by-set-partitioning. Provide a complete trellis diagram including modulation outputs.

(c)

Draw the TCM decoder front end.

(d)

Given A = 1 and that the output of the TCM decoder front end is (0.7, 0), (1.2, 0), (3.3, 0), (1.2, 0), use the Viterbi Algorithm to determine the output of the TCM decoder.

Figure Q8.1. Convolutional coder

Figure Q8.2. 8 outputs of modulator

2.

Consider the convolutional coder of Figure Q8.3. Assume that the modulation is 32-ASK. Using your understanding of trellis diagrams and mapping-by-set partitioning, draw a trellis diagram that includes the output of the modulator.

Figure Q8.3. Convolutional coder

3.

Provide a trellis diagram and a block diagram that fully describe a TCM coder meeting the following criteria:

The channel coder must have 1 bit input and 2 bits output;

The modulation scheme must be QPSK.

4.

Describe the benefits of using the TCM of Figure Q8.1 in place of using the same convolutional coding followed by a 4-PSK modulator.

URL: https://www.sciencedirect.com/science/article/pii/B9780080518671500147

Convolutional Codes

Takahiro Yamada , in Essentials of Error-Control Coding Techniques, 1990

4.3.1 Trellis Diagram for a Convolutional Encoder

The Viterbi algorithm finds the maximum-likelihood code sequence x(D) using a diagram called a trellis. In Section 4.1, we saw that a convolutional encoder can be represented by a state diagram. The trellis diagram shows every possible sequence of state transitions of the state diagram.

Let's consider the encoder of Fig. 4.4. The state diagram of this encoder is shown in Fig. 4.5. If we draw every possible state transition sequence of the encoder for the first four transitions, assuming that the initial state is 00, we get the trellis diagram of Fig. 4.6. In this diagram, transitions caused by an input one are indicated by dashed branches, while transitions caused by a zero are indicated by solid branches. The symbols shown along each branch are the output symbols generated by the encoder. For example, if the state of the encoder at time t is 00 and the input symbol i t = 1 is fed into the encoder, then the encoder generates the output symbols x t = (11) and the state becomes 01. This is represented by the dashed branch from node 00 at node level t to node 01 at node level t + 1 in the trellis diagram. If an input sequence 1101 is given to the encoder with state 00, the output sequence is 11 10 10 00, which is indicated by the state transition path 00 → 01 → 11 → 10 → 01 in the trellis.
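
Because Fig. 4.4 is not reproduced here, the short sketch below uses an assumed pair of generator taps (1 + D^2 for the first output and 1 + D + D^2 for the second), chosen so that it reproduces the example above: input 1101 gives output 11 10 10 00 and the state path 00 → 01 → 11 → 10 → 01.

```python
# Sketch (Fig. 4.4 is not reproduced here): a rate-1/2 feedforward convolutional
# encoder whose generator taps are an assumption chosen to match the example above.
def encode(bits):
    s1 = s2 = 0                                 # shift register: previous and second-previous inputs
    outputs, states = [], []
    for u in bits:
        outputs.append((u ^ s2, u ^ s1 ^ s2))   # output symbols x_t
        s2, s1 = s1, u                          # shift the register
        states.append((s2, s1))                 # state label after this input
    return outputs, states

outputs, states = encode([1, 1, 0, 1])
print(outputs)   # [(1, 1), (1, 0), (1, 0), (0, 0)]  ->  11 10 10 00
print(states)    # [(0, 1), (1, 1), (1, 0), (0, 1)]  ->  path 00 -> 01 -> 11 -> 10 -> 01
```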

URL: https://www.sciencedirect.com/science/article/pii/B9780123707208500088

Equalization

J. Barry , in Academic Press Library in Mobile and Wireless Communications, 2016

8.3.1 The Trellis Diagram

Before adding noise, the channel model in Eq. (8.3) applies the sequence of information symbols a k to a linear filter with impulse response h k , producing the filter output s k = a k * h k . For the case when the channel memory is finite (μ < ∞), it will be convenient to view this filter as a finite-state machine (FSM). In particular, let θ k = [a k−1, a k−2, …, a k−μ ] denote the filter state at time k. Because each symbol a k ∈ A is chosen from a finite alphabet, the number of such states is clearly finite, namely Q = |A|^μ. More than just having a finite number of states, the filter is an FSM because it further satisfies two additional properties. First, the current output s k is uniquely determined by the pair (a k , θ k ), a fact that becomes clear when we rewrite the convolution s k = a k * h k as follows:

(8.11) s_k = h_0 a_k + [h_1, h_2, …, h_μ] θ_k^T.

Second, the next state θ k+1 is also uniquely determined by the pair (a k , θ k ); this is also clearly true, since the next state θ k+1 can be formed by concatenating the current input a k with the first μ − 1 entries of the current state θ k . The key to the FSM formulation is that the state sequence satisfies the Markov property P( θ k+1| θ 0, θ 1,   …, θ k ) = P( θ k+1| θ k ). Intuitively, this means that knowledge of the state at time k tells us everything we need to know about what the next state at time k + 1 might be; the history of how we got there is irrelevant. Two tapped-delay line models for the FSM are shown in Fig. 8.3.

Fig. 8.3. Two equivalent views of the ISI filter as a finite-state machine. In both cases the state of the filter is the contents of its memory elements. In (A) the memory stores the information symbols, while in (B) the memory stores the corresponding message bit blocks, where b k denotes the unique block of log2|A| bits associated with the symbol a k ∈ A. The model of (B) is preferred when computing probabilities for the message bits using the Bahl-Cocke-Jelinek-Raviv algorithm.

So far, we have assumed that the transmitter sends a symbol sequence of length L, starting with a 0 and ending with a L−1. To proceed further we need to clarify what happens before a 0 and after a L−1. While it might seem natural to assume that nothing is transmitted before and after, namely to assume that a k = 0 for all k ∉ {0, …, L − 1}, this can be problematic when 0 ∉ A, because it would require that an augmented alphabet A ∪ {0} is used to define the states during the transients at the beginning and ending of the message. Besides, in practice it is more common to use preambles or postambles of noninformation-bearing symbols for tasks such as frame synchronization and channel estimation. To better capture this reality we will identify one symbol from the alphabet as the idle symbol, denoted a 0 ∈ A, and we will assume that the transmitter uses both a preamble and a postamble, each consisting of a block of μ idle symbols. In other words, we assume that a k = a 0 for k ∈ {−μ, …, −1} and k ∈ {L, …, L + μ − 1}. The preamble and postamble are shown in Fig. 8.4. The preamble ensures that the state is in the all-idle state at time 0, namely θ 0 = [a 0, …, a 0], while the postamble ensures that the state is in the all-idle state at time L + μ, namely θ L+μ = [a 0, …, a 0].

Fig. 8.4. We assume that the L information symbols are sandwiched between a preamble and postamble, both consisting of a block of μ idle symbols.

In what follows we associate an integer label p ∈{0,   …, Q − 1} with each state, and we will reserve the zero label for the all-idle state; with this notation, the preamble and postamble ensure that both θ 0 = 0 and θ L+μ = 0. The symbols in the postamble are sometimes referred to as termination symbols, since their purpose is to terminate the trellis to a known state (namely, state 0).

The trellis diagram is defined simply as a plot of all possible states versus time.

Example 8.2

Suppose a block of L = 5 binary symbols from A = {±1} is transmitted across an ISI channel with memory μ = 2, sandwiched between a preamble and postamble of two idle symbols a 0 = −1. There are four possibilities for the state θ k = [a k−1, a k−2]: it can be [−1, −1], [+1, −1], [−1, +1], or [+1, +1], which are assigned the integer labels 0, 1, 2, and 3, respectively. The corresponding trellis diagram is shown in Fig. 8.5A. The preamble ensures that the state begins at state 0 at time 0. The state at time 1 is either 0 or 1, depending on the value of a 0. The trellis terminates at state 0 at time 7. The path corresponding to the message [a 0, …, a 4] = [−1, +1, +1, −1, +1] is highlighted. In all, there are |A|^L = 2^5 = 32 distinct paths through this trellis, one for each possible message.

Fig. 8.5. Examples of trellis diagrams: (A) A relatively simple trellis diagram for a message of length L = 5, alphabet size |A| = 2, and channel memory μ = 2. There are Q = |A|^μ = 4 states; and (B) a more complicated trellis diagram for alphabet size |A| = 4 and channel memory μ = 2, with Q = |A|^μ = 16 states. Each message sequence a corresponds to a unique "path" through the trellis that begins at state 0 at time 0 and ends at state 0 at time L + μ.

Longer messages, larger alphabets, and higher ISI memory can lead to a more complicated trellis.

Example 8.3

Suppose a block of L symbols chosen from an alphabet of size M = |A| = 4 (such as 4-QAM or 4-PSK) is transmitted across an ISI channel with memory μ = 2. In this case the state θ k = [a k−1, a k−2] is the previous two symbols, so that the number of states is Q = |A|^μ = 16. The preamble and postamble ensure that the starting state (at time k = 0) and the ending state (at time k = L + μ) are both zero. The corresponding trellis diagram is shown in Fig. 8.5B. The trellis diagram has a total of L + μ stages. The first L stages of the trellis correspond to the L message symbols. In these stages there are M = |A| = 4 branches emanating from each node, one for each possible message symbol. The last μ stages of the trellis correspond to the postamble idle symbols. In these stages there is only one branch emanating from each node, namely the branch corresponding to the idle symbol.
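
The FSM view of Eq. (8.11) is straightforward to enumerate in code. The sketch below uses illustrative channel taps (h is not taken from the chapter) with the binary alphabet and memory μ = 2 of Example 8.2, and prints the number of states and the number of distinct trellis paths:

```python
# Sketch with illustrative channel taps (h is not from the chapter): the ISI filter
# of Eq. (8.11) as a finite-state machine, for the binary alphabet of Example 8.2.
from itertools import product

A = [+1, -1]
mu = 2
h = [1.0, 0.5, 0.25]                              # h_0, h_1, h_2 (made-up values)

states = list(product(A, repeat=mu))              # theta_k = [a_{k-1}, a_{k-2}]
print(len(states))                                # Q = |A|^mu = 4

def step(a_k, theta_k):
    """Return (noiseless output s_k, next state theta_{k+1}) per Eq. (8.11)."""
    s_k = h[0] * a_k + sum(h[i] * theta_k[i - 1] for i in range(1, mu + 1))
    return s_k, (a_k,) + theta_k[:-1]             # shift a_k in, drop the oldest symbol

print(step(+1, (-1, -1)))                         # leaving the all-idle state with a_k = +1

L = 5
print(len(list(product(A, repeat=L))))            # |A|^L = 32 distinct trellis paths
```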

URL: https://www.sciencedirect.com/science/article/pii/B9780123982810000089

Error-Control Coding

Ali Grami , in Introduction to Digital Communications, 2016

10.5.3 Trellis-Coded Modulation

In the absence of trellis-coded modulation, at the transmitter, encoding is performed separately from modulation, and at the receiver, demodulation (detection) is carried out independently of decoding. Error-correcting codes provide improvements in error-rate performance at the cost of bandwidth expansion (i.e., a reduction in bandwidth efficiency is traded for an increase in power efficiency). In the case of band-limited channels, such as telephone channels, an improvement in error-rate performance is not possible unless some form of combined modulation and coding, called trellis-coded modulation, is employed. The performance improvement due to trellis-coded modulation is not at the expense of bandwidth expansion, but at the expense of decoding complexity.

A limitation to the performance of error-control codes is the modulation technique, as the selection of a code must go hand-in-hand with the choice of modulation technique for the channel. Trellis-coded modulation (TCM) combines an amplitude and/or phase modulation signaling set with a trellis-coding scheme, such as a convolutional code. The number of signal points in the constellation using TCM is larger than what is required when coding is not employed. These additional signal points, at a constant power level, decrease the minimum Euclidean distance within the constellation. However, this reduction in minimum Euclidean distance is more than compensated by the increase in the distance between coded sequences provided by the channel coding, so that the error rate improves significantly while the bandwidth is conserved thanks to the use of the larger set of signal points.

The trellis-coding scheme, such as a convolutional code that has memory, allows only certain sequences of signal points for transmission. In the design of trellis codes, the emphasis is on maximizing the Euclidean distance between codewords rather than maximizing their Hamming distance. Using an optimal soft-decision decoding algorithm for convolutional codes, such as the Viterbi algorithm, the possible sequences of signals are modeled as a trellis structure. Assuming an AWGN channel, maximum-likelihood decoding of a trellis code consists of finding that particular path through the trellis with minimum Euclidean distance to the received sequence.

The key concept in the construction of efficient TCM schemes is set partitioning. The approach is to partition an M-ary constellation successively into 2^i sub-sets of M/2^i signal points each, where i is a positive integer, with the goal of further increasing the minimum Euclidean distance between their respective signal points. At each step, a set is partitioned into two congruent sub-sets chosen so that the signal points within each sub-set are maximally apart. The process continues as far as the code allows.

Figure 10.10 shows the block diagram of a TCM scheme. The incoming block of k information bits consists of two streams of k1 bits and k2 bits (i.e., k = k1 + k2). The first k1 bits are the input to an (n1, k1) encoder, and the output thus consists of n1 bits. The signal points in the constellation are partitioned into 2^n1 sub-sets. The remaining k2 information bits determine a signal point in a given sub-set (i.e., in each sub-set there are 2^k2 signal points). In other words, the constellation should contain 2^(n1 + k2) points. This also determines the number of steps required in the partitioning of the constellation.

Figure 10.10. Block diagram of a TCM scheme.

The objective of trellis-coded modulation is to increase the minimum distance between the signals that are the most likely to be confused, without increasing the average signal power. Even though the increase in the size of the signal set reduces the minimum distance between symbols, the Euclidean distance between trellis-coded sequences more than compensates for the signal points being jammed together.

The Viterbi algorithm is employed in the decoding of TCM, as the modulator has memory and maximum likelihood sequence detection must thus be performed. First, the most likely signal point in each partition is determined, i.e., we find the signal point in each sub-set that has the smallest Euclidean distance from the received signal. This Euclidean distance is then used to find a path through the trellis diagram whose total Euclidean distance from the received sequence is minimum.

Example 10.12

Provide an example of set partitioning for a coded 8-PSK signal set, where two information bits per point are transmitted, and compare the result with an uncoded QPSK signal set.

Solution

Figure 10.11 shows the partitioning of an 8-PSK constellation. If the minimum Euclidean distance of uncoded QPSK is d1 = √2 (for unit-energy signal points), the minimum distance of the coded 8-PSK is then d2 = 2, as it represents the Euclidean distance between the antipodal signal points of the same sub-set. This can thus yield an asymptotic gain of 20 log10(2/√2) = 3 dB.

Figure 10.11. Partitioning of 8-PSK constellation for Example 10.12.
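
The distances quoted in the example are easy to verify numerically. The sketch below (not from the book) places both constellations on the unit circle, takes one of the final 8-PSK sub-sets to be an antipodal pair, and computes the resulting asymptotic gain:

```python
# Sketch (not from the book): verifying the distances in Example 10.12 for
# unit-energy constellations. Two levels of partitioning of 8-PSK leave sub-sets
# of two antipodal points.
import numpy as np

qpsk = np.exp(2j * np.pi * np.arange(4) / 4)      # 4 points on the unit circle
psk8 = np.exp(2j * np.pi * np.arange(8) / 8)      # 8 points on the unit circle

def dmin(points):
    return min(abs(a - b) for i, a in enumerate(points) for b in points[i + 1:])

d1 = dmin(qpsk)                    # ~1.414 = sqrt(2): uncoded QPSK
d2 = dmin([psk8[0], psk8[4]])      # 2.0: antipodal pair within one final sub-set
print(d1, d2, 20 * np.log10(d2 / d1))   # ~3 dB asymptotic gain
```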

URL: https://www.sciencedirect.com/science/article/pii/B9780124076822000107

Space Time Codes and MIMO Transmission

Burr Alister , in MIMO, 2011

3.4.1 STTC

The first space-time codes to be developed based on the code design criteria outlined in Section 3.3 were in fact trellis codes. The criteria, although they are couched in terms of block codes, can readily be applied also to trellis codes, as we will see. They were introduced in the same paper [TSC98] that described the rank and determinant criteria.

Trellis codes (or convolutional codes as they are known when they are simple binary codes) are based on an encoder having memory [B01], Chapter 7. The current output block (for STTC usually one symbol long) depends not only on the current data block (length k symbols), but also on several previous blocks (the total number of blocks, including the current one, being the constraint length, ν ). Typically, the blocks are very short, often including only one or two data bits. The encoder is implemented as a shift register. An example is shown in Figure 3.8, which shows the encoder for an example STTC with constraint length ν = 2 and input block length k = 2 . Each output space-time symbol depends on the current input block (two bits, entering the shift register in parallel), plus the two previous bits from the previous symbol period, stored in the two parallel shift register elements. These are fed to a space-time modulator, which maps the four resulting bits to two modulated symbols to be transmitted on two antennas.

Figure 3.8. Example STTC encoder: data block length k = 2 ; constraint length ν = 2 .

The space-time modulator can be described by means of a generator matrix like that for conventional FEC codes. If we construct a vector of data bits in the form:

(3.26) d = [d_i^0, d_i^1, …, d_i^(k−1), d_(i−1)^0, …, d_(i−1)^(k−1), …, d_(i−ν+1)^0, …, d_(i−ν+1)^(k−1)]

then the transmitted symbols are given by:

(3.27) s = Mod(dG)^T

where Mod(.) here denotes the modulation and G is the generator matrix. The elements of G are M c -ary, where M c is the number of points in the modulation constellation, and the matrix multiplication in (3.27) is carried out in modulo-M c arithmetic. In our example code, which is in fact one due to Tarokh et al. in [TSC98], the generator matrix is:

(3.28)  G = [ 2  0
              1  0
              0  2
              0  1 ]

and the modulation is QPSK, with constellation points numbered in natural order, so that the point numbers correspond to complex modulated signals as shown in Figure 3.9.

Figure 3.9. QPSK constellation mapping for example code.

A trellis code, as the name suggests, can be represented by a trellis diagram. The trellis diagram for our example code is given in Figure 3.10, which shows all the possible states of the encoder and all the possible state transitions as a function of time. Each column of the diagram represents a block (input bit) period, while each node represents the state of the encoder at the end of the input block period, and the branches show the permitted transitions between them. The branches are labeled according to the constellation point to be transmitted from each transmit antenna. Note that to avoid difficulty labeling the diagram the branch labels have been placed on the left of the diagram, adjacent to the state labels. They take the form of a list of the labels, which apply to the branches diverging from that state, from top to bottom. For the correspondence of input data to branches diverging from a state, we assume natural numbering from top to bottom: that is, the branches correspond to data 00, 01, 10, 11.

Figure 3.10. Trellis for example space-time trellis code.

The value of the trellis diagram lies in the fact that every valid code sequence corresponds to a path through the trellis. Unlike block codes, the output of a trellis encoder cannot be divided into independent blocks: we must deal with potentially infinite code sequences instead. The decoder must find the sequence or path closest to the received signal, which is most likely to be the correct transmitted sequence. This is known as maximum likelihood (ML) decoding, since it returns the code sequence most likely to have been transmitted, given the received signal. An error event occurs when the decoder compares two candidate sequences corresponding to a pair of paths that begin and end on the same state, and selects the wrong one. The error event probability is the equivalent of the pairwise error probability in block codes, and can be bounded in the same way as described in Section 3.2. The code design criteria from Section 3.3 can also be applied to all such pairs of sequences in the trellis, in the same way as to all pairs of codewords in a block code.

Consider, for example, the pair of paths in the trellis of Figure 3.10 marked with dashed and heavy lines, respectively. The dashed line is the all-zeros sequence, corresponding to data 00, 00, while the heavy line corresponds to data 01, 00. The dashed line also generates a string of 0 constellation symbols, i.e., signals +1, +1. The data vectors d and modulated signals s corresponding to the heavy line are:

(3.29)  d = [ 0 1 0 0 ]     dG = [ 1 0 ]     s = [ j 1 ]
            [ 0 0 0 1 ]          [ 0 1 ]         [ 1 j ]

which correspond also to the branch labels in Figure 3.10. The codeword difference matrix corresponding to this pair of sequences, and its correlation matrix, are:

(3.30)  D = [ 1−j   0  ]     DD^H = [ 2 0 ]
            [  0   1−j ]            [ 0 2 ]

This has rank 2 and determinant 4, which corresponds to full diversity. From Figure 3.10 we note that on the marked path, constellation symbol 1 (corresponding to data 01) is transmitted twice, once from the first antenna and once from the second. An error event is likely to occur only if the channels from both antennas are faded, which indicates that second order diversity should indeed be achieved.
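
The computation in (3.29) and (3.30) can be reproduced with a few lines of code. The sketch below is illustrative only: the generator matrix is that of (3.28), while the QPSK map (0 → +1, 1 → +j, 2 → −1, 3 → −j) is an assumption consistent with the natural numbering of Figure 3.9 and with the signals listed in (3.29):

```python
# Illustration only: reproducing (3.29) and (3.30). The QPSK map is an assumption
# consistent with the natural numbering of Figure 3.9.
import numpy as np

G = np.array([[2, 0],
              [1, 0],
              [0, 2],
              [0, 1]])                            # 4 x 2, 4-ary elements

qpsk = {0: 1, 1: 1j, 2: -1, 3: -1j}

def encode(d_rows):
    """Each row of d_rows is [d_i^0, d_i^1, d_{i-1}^0, d_{i-1}^1] for one symbol period."""
    labels = (np.array(d_rows) @ G) % 4           # modulo-4 arithmetic, as in (3.27)
    symbols = np.array([[qpsk[int(v)] for v in row] for row in labels])
    return symbols.T                              # Mod(dG)^T: rows = antennas, columns = time

s_heavy  = encode([[0, 1, 0, 0], [0, 0, 0, 1]])   # heavy-line path of Figure 3.10
s_dashed = encode([[0, 0, 0, 0], [0, 0, 0, 0]])   # all-zeros (dashed) path

D = s_dashed - s_heavy                            # codeword difference matrix
R = D @ D.conj().T                                # its correlation matrix DD^H
print(s_heavy)                                    # [[j, 1], [1, j]]
print(np.linalg.matrix_rank(R), np.linalg.det(R).real)   # rank 2, determinant 4
```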

We should use the same approach to search for the minimum rank and determinant of all codeword difference matrices corresponding to such pairs of paths. Because of the structure of the encoder and the linear generator matrix approach used to encode it, in fact we need only compare the all zero path (the dashed line and its extension) with all other paths which branch from it and return to it later in the trellis. It is quite easy to establish that all such paths give rank 2, and none gives a determinant less than 4. All in effect involve repeating a data symbol on each antenna. Hence, this code does achieve full diversity, 2.

The drawback of space-time trellis codes is that for larger numbers of antennas to achieve full diversity requires the encoder to have longer memory (a longer shift register), and the trellis to contain exponentially more states. Since decoding complexity is proportional to the number of trellis states and branches, this tends to become excessively computationally expensive.

URL: https://www.sciencedirect.com/science/article/pii/B9780123821942000034

Audio Alignment and Temporal Modeling

Theodoros Giannakopoulos , Aggelos Pikrakis , in Introduction to Audio Analysis, 2014

7.4 The Baum-Welch Algorithm

Instead of seeking the best-state sequence, it is often desirable to compute the sum of scores of all the sequences of states that are capable of generating a given sequence of observations. A small modification in Eq. (7.7) can provide the answer to this question. Specifically, if the maximization operator is replaced with a summation, Eq. (7.7) becomes:

(7.9) α_i(t) = Σ_{j=1}^{N} α_j(t − 1) A(j, i) B(x(t), i).

Equation (7.9) is computed for every node of the trellis diagram, following the node visiting scheme that was also adopted for the Viterbi algorithm. In the end, the values of the last column ( T th column) are summed to produce the Baum-Welch score (or recognition probability), r score , of the HMM, given the sequence of observations:

(7.10) r_score = Σ_{j=1}^{N} α_j(T).

For this algorithm, it no longer makes sense to store the best predecessor of each node into a separate matrix and seek the best sequence of states, because every possible sequence of states contributes to the recognition score.
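
The forward recursion of Eqs. (7.9) and (7.10) can be sketched in a few lines. The code below is not the book's scaledBaumWelchDisObs() implementation; it is an unscaled illustration (the scaling needed to avoid numerical underflow, discussed next, is omitted), with made-up HMM parameters:

```python
# Sketch: the forward recursion of Eqs. (7.9) and (7.10) for discrete observations.
# Not the book's scaledBaumWelchDisObs() implementation; scaling is omitted.
import numpy as np

def recognition_score(x, pi, A, B):
    """
    x  : observation sequence (symbol indices)
    pi : initial state probabilities, shape (N,)
    A  : A[j, i] = P(state i at t | state j at t - 1), shape (N, N)
    B  : B[k, i] = P(observing symbol k | state i), shape (K, N)
    """
    alpha = pi * B[x[0], :]                   # initialization at t = 1
    for t in range(1, len(x)):
        alpha = (alpha @ A) * B[x[t], :]      # Eq. (7.9): sum over predecessor states j
    return alpha.sum()                        # Eq. (7.10): the Baum-Welch score r_score

# Tiny illustration with made-up HMM parameters (2 states, 2 observation symbols):
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.7, 0.3], [0.3, 0.7]])
pi = np.array([0.5, 0.5])
print(recognition_score([0, 0, 1], pi, A, B))
```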

We have implemented a scaled version of the Baum-Welch algorithm in the scaledBaumWelchDisObs() function. The scaling technique is more complicated in this case, compared to the Viterbi method, and the reader is referred to [66] for a detailed description of the subject. The scaledBaumWelchDisObs() function preserves the input format of the scaledViterbiDisObs() function; obviously, the recognition score is the only output argument in this case.

Further to the functions that operate on discrete observations, we also provide their counterparts for the case of continuous features. The respective functions have been implemented in the scaledViterbiContObs() and scaledBaumWelchContObs() m-files. The notable difference is the format of the argument that refers to emission of observations at each state. In the case of continuous features, it no longer makes sense to use a simple matrix format and we need to resort to a more complicated structure based on cell arrays, so that the pdf of each state is modeled successfully. A more detailed description is given in the next section in conjunction with the training procedure.

URL: https://www.sciencedirect.com/science/article/pii/B9780080993881000078