Gilder beats Moore

Project Description

The technology to communicate data is improving at a faster rate than the technology to process data. If the current trend continues, the processing of data, rather than its communication, will become the bottleneck.

This project's focus lies on digital signal processing algorithms which contain feedback loops (for example the IIR filter). The challenge lies in implementing these algorithms in such a way that the sample rate exceeds the clock rate. The ideal is to find implementations that scale in K, where sample rate = K × clock rate.
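As a rough illustration of what "scaling in K" can mean, the sketch below computes K outputs of a first-order IIR filter per iteration. The filter y[n] = a·y[n-1] + x[n], the function names and the parameter names are assumptions chosen for this example and are not taken from the project documents. Unrolling the recurrence gives y[n+j] = a^(j+1)·y[n-1] + Σ a^(j-i)·x[n+i], so every output in a block depends only on the block's inputs and a single fed-back value.

```python
def iir_sequential(x, a, y_prev=0.0):
    """Reference version: y[n] = a*y[n-1] + x[n], one output per step."""
    y = []
    for sample in x:
        y_prev = a * y_prev + sample
        y.append(y_prev)
    return y


def iir_block(x, a, K, y_prev=0.0):
    """Same filter, but K outputs per iteration (hypothetical helper)."""
    if len(x) % K != 0:
        raise ValueError("len(x) must be a multiple of K")
    powers = [a ** j for j in range(K + 1)]  # a^0 .. a^K, computed once
    y = []
    for start in range(0, len(x), K):
        block = x[start:start + K]
        # Each of the K outputs depends only on y_prev and the block inputs,
        # so on parallel hardware they could all be computed at the same time.
        outs = [
            powers[j + 1] * y_prev
            + sum(powers[j - i] * block[i] for i in range(j + 1))
            for j in range(K)
        ]
        y.extend(outs)
        y_prev = outs[-1]  # the only value fed back to the next iteration
    return y
```

The block version spends more arithmetic per output, but the feedback path is traversed only once per K samples, which is the property that allows the sample rate to exceed the clock rate.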

Our secondary goals are to minimize power consumption and to keep the finite precision effects to a minimum.

There is also a pdf file available with a more extensive description of the goals of the project. The project document contains the planning of the project and can be downloaded as a pdf file.


Several people are working on this project:

Current Topics

The current focus of our research lies with these topics:

Vector Processor

In order to determine which instructions are useful for a vector processor, we implemented several algorithms on a fictional vector processor. This is still work in progress, but I have looked at some algorithms and gathered my findings and notes in one pdf file.

We also looked at how the VLSI architectures for IIR filters could be implemented on a vector processor. The (unfinished) document containing our findings so far can be found in a pdf file. As it turns out, there are programs that take a constant number of clock cycles while calculating a number of outputs that is linearly dependent on the vector size. In other words, the throughput scales linearly with the vector size: twice the vector size produces twice the throughput.

We also wanted to know how we can systematically derive IIR implementations for the vector processor. The (unfinished) document containing our findings so far can be downloaded as a pdf file. We found a connection between the design techniques for parallel programs (as taught in the course Ontwerp van parallele programma's/Design of parallel programs) and the vector processor, but the research is far from done.

Adaptive Filters

We have looked at IIR implementations, but the implementations we considered assumed that the coefficients of the filter did not vary. Most of them even assumed that some kind of preprocessing could be done on the coefficients to obtain a more beneficial representation of the filter. We now consider filters where the coefficients are not invariant, but vary over time.

Least Mean Square

The Least Mean Square (LMS) adaptive filter is one of the most commonly used adaptive filters. It is possible to implement a parallel version of the LMS filter, but because of the feedback loops for the coefficient updates this takes a large amount of hardware. Since the adaptivity of the filter is, in practical cases, more important than the exact input-output relation, an altered LMS filter is used instead. This altered version ideally has the same adaptation behaviour as the original LMS filter, but costs less hardware to implement in parallel.
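For reference, the textbook LMS recursion can be sketched as follows. This is a minimal sketch, not the project's implementation: the function name, the step size `mu` and the zero initial coefficients are assumptions made for this example. The comment in the loop marks the coefficient feedback loop mentioned above: the coefficients used for the next output depend on the error of the current one.

```python
def lms(x, d, num_taps, mu):
    """Textbook LMS adaptive FIR filter (illustrative sketch).

    x: input samples, d: desired samples, mu: step size.
    Returns (outputs, errors, final coefficients).
    """
    w = [0.0] * num_taps        # filter coefficients, start at zero
    window = [0.0] * num_taps   # x[n], x[n-1], ..., newest sample first
    ys, es = [], []
    for xn, dn in zip(x, d):
        window = [xn] + window[:-1]
        y = sum(wi * xi for wi, xi in zip(w, window))
        e = dn - y
        # Coefficient feedback loop: the next output needs this update,
        # which in turn needs the error of the current output.
        w = [wi + mu * e * xi for wi, xi in zip(w, window)]
        ys.append(y)
        es.append(e)
    return ys, es, w
```

Because each update depends on the error that was just computed, consecutive outputs cannot be produced independently without extra hardware or a change to the algorithm.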

An example of such a filter is the PIPLMS (Pipelined LMS) filter described by Parhi in "VLSI Digital Signal Processing Systems" in section 10.8.2. This filter is based on a common LMS filter and has three tunable parameters, so we could describe it as PIPLMS(M1,M2,M2'). The filter known as DLMS (Delayed LMS) is equal to PIPLMS(D,1,1), where D is the delay. Implementation of the PIPLMS filter on a SIMD processor is easy if M1 and M2 equal the processor's vector size. In that case the processor can produce a full vector of outputs per iteration.
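The DLMS special case can be sketched as below, assuming the usual delayed update w[n+1] = w[n] + mu·e[n-D]·x[n-D]. This is only an illustration of the delayed coefficient update, not of the full PIPLMS filter, and the function name, the parameter names and the `pending` queue are hypothetical.

```python
def dlms(x, d, num_taps, mu, delay):
    """Delayed LMS sketch: the update at time n uses the error and the
    input window from time n - delay. With delay = 0 it reduces to LMS."""
    w = [0.0] * num_taps
    window = [0.0] * num_taps   # newest sample first
    pending = []                # (error, input window) pairs not yet applied
    ys, es = [], []
    for xn, dn in zip(x, d):
        window = [xn] + window[:-1]
        y = sum(wi * xi for wi, xi in zip(w, window))
        e = dn - y
        ys.append(y)
        es.append(e)
        pending.append((e, window))
        if len(pending) > delay:
            # Apply the update that is `delay` samples old.
            e_old, win_old = pending.pop(0)
            w = [wi + mu * e_old * xi for wi, xi in zip(w, win_old)]
    return ys, es, w
```

The delay breaks the tight dependency between an output and the very next coefficient update, which is what leaves room for pipelining.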

The PIPLMS algorithm adapts the filter coefficients for each output, just like the original LMS algorithm. However, there is also the BLMS (Block LMS) algorithm, which adapts the coefficients once for each block of outputs. This algorithm is also easy to implement on a SIMD processor if the block size equals the processor's vector size. In that case the processor can, again, produce a full vector of outputs per iteration.
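A block update of this kind can be sketched as follows. The sketch assumes the common formulation in which the per-sample gradient terms e[n]·x[n] are summed over the block and applied once (some texts additionally divide by the block size); the function and parameter names are hypothetical.

```python
def blms(x, d, num_taps, mu, block_size):
    """Block LMS sketch: coefficients stay fixed inside a block and are
    updated once per block with the accumulated gradient."""
    if len(x) % block_size != 0:
        raise ValueError("len(x) must be a multiple of block_size")
    w = [0.0] * num_taps
    window = [0.0] * num_taps   # newest sample first
    ys, es = [], []
    for start in range(0, len(x), block_size):
        grad = [0.0] * num_taps
        stop = start + block_size
        for xn, dn in zip(x[start:stop], d[start:stop]):
            window = [xn] + window[:-1]
            # w does not change inside the block, so these outputs are
            # independent of each other and could be computed in parallel.
            y = sum(wi * xi for wi, xi in zip(w, window))
            e = dn - y
            grad = [g + e * xi for g, xi in zip(grad, window)]
            ys.append(y)
            es.append(e)
        w = [wi + mu * g for wi, g in zip(w, grad)]  # one update per block
    return ys, es, w
```

With `block_size` equal to the vector size, the inner loop corresponds to one vector of outputs per iteration.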

In general, the adaptation performance of the filter degrades as the delay parameters (M1, M2 and D) grow, while the throughput improves. Therefore the best choice of implementation will depend on the specific application in which the filter is used.

Future Topics

In the future we will probably consider adaptive equalizers.

Older Topics

A section of our website is devoted to topics we have researched in the past. We occasionally revisit these topics or still have to make a final version of the report, but in general the main focus of our research is not on these topics.


The bibliography contains articles about the subjects of this project as well as other subjects.

We also have a page called "The Records" with documents containing some interesting things we found during our research.

The slides of the presentations I have given can be found on the presentations page.


We welcome questions, remarks and suggestions. If you have any, you can contact us at