To the point & Out of the box


Between 1992 and 1998 I worked extensively on massively parallel computer systems. I was a founding member of PACT AG, Munich, which invented one of the first run-time reconfigurable processor chips. I'm the inventor of a number of things that were patented over time. Below you will find a list of my patents, each with a short abstract of what it covers.

Data Processing Device

DE 44 16 881 (US 08/544,435)

The invention concerns a data processing device in which an integrated circuit (chip), called a data flow processor (DFP) below, is provided with many cells. The cells are, in particular, arranged orthogonally to each other, are homogeneously structured, and each have several logically similar and structurally identically arranged components. They are connected to the input/output connections of the integrated circuit by row and column, possibly combined into groups. According to the invention, the cells have a load logic through which they are programmable (configurable) individually and possibly combined into groups, so that any logical functions and/or interconnections among the cells can be realized. In particular, the DFP configuration can be manipulated during operation (at run time): functional parts (MACROS) of the DFP can be modified without stopping other functional parts or affecting their function.

Unit for processing numeric and logic operations for use in central processing units (CPUs), multiprocessor systems, data-flow processors (DFPs), systolic processors and field programmable gate arrays (FPGAs)

DE 196 51 075 (US 08/946,810; PCT DE97/02949)

An expanded arithmetic and logic unit (EALU) with special extra functions is integrated into a configurable unit for performing data processing operations. The EALU is configured by a function register, which greatly reduces the volume of data required for configuration. The cell can be cascaded freely over a bus system, the EALU being decoupled from the bus system over input and output registers. The output registers are connected to the input of the EALU to permit serial operations. A bus control unit is responsible for the connection to the bus, which it connects according to the bus register. The unit is designed so that distribution of data to multiple receivers (broadcasting) is possible. A synchronization circuit controls the data exchange between multiple cells over the bus system. The EALU, the synchronization circuit, the bus control unit, and registers are designed so that a cell can be reconfigured on site independently of the cells surrounding it. A power-saving mode which shuts down the cell can be configured through the function register; clock rate dividers which reduce the working frequency can also be set.

I/O and memory bus system for DFPs and units with two- or multi-dimensional programmable cell architectures

DE 196 54 595 (US 08/947,254; PCT DE97/03013)

A uniform bus system is provided which operates without any special consideration by a programmer. Memories and peripherals may be connected to this bus system without any special measures. Likewise, units may be cascaded with the help of the bus system. The bus system combines a number of internal lines and leads them as a bundle to terminals. The bus system control is predefined and does not require any intervention by the programmer. Any number of memories, peripherals or other units can be connected to the bus system.

Process for automatic dynamic reloading of data flow processors (DFPs) and units with two- or three-dimensional programmable cell architectures (FPGAs, DPGAs and the like)

DE 196 54 846 (US 6,088,795)

In a method for processing data in a configurable unit having a multidimensional cell arrangement, a switching table is provided, the switching table including a controller and a configuration memory. Configuration strings are transmitted from the switching table to a configurable element of the unit to establish a valid configuration. A configurable element writes data into the configuration memory. The controller of the switching table recognizes individual records as commands and may execute the recognized commands. The controller may also recognize and differentiate between events and execute an action in response thereto. In response to an event, the controller may move the position of a pointer and, if it has received configuration data rather than commands for the controller, send the configuration data to the configurable element defined in the configuration data. The controller may send a feedback message to the configurable element. The configurable element may recognize and analyze the feedback message. A configurable element may transmit data into the configuration memory of the switching table.

Run-Time reconfiguration method for programmable units

DE 196 54 593

A method of run-time reconfiguration of a programmable unit is provided, the programmable unit including a plurality of reconfigurable function cells in a multidimensional arrangement. An event is detected. The source of the detected event is determined, and an address of an entry in a jump table is calculated as a function of the source of the event, the entry storing a memory address of a configuration for a reconfigurable function cell. The entry is retrieved and a state of a corresponding reconfigurable cell is determined. If the reconfigurable cell is in a reconfiguration state, the reconfigurable cell is reconfigured as a function of the configuration data. If the reconfigurable cell is not in a reconfiguration state, the configuration data is stored in a FIFO.
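The event handling described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the class names, the dictionary-based jump table and configuration memory, and the `retry_deferred` step that drains the FIFO are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Cell:
    """Hypothetical reconfigurable function cell."""
    reconfigurable: bool = False  # True while the cell is in a reconfiguration state
    config: str = "idle"


@dataclass
class Reconfigurator:
    jump_table: dict      # event source -> memory address of a configuration
    config_memory: dict   # memory address -> (cell id, configuration data)
    cells: dict           # cell id -> Cell
    fifo: deque = field(default_factory=deque)

    def on_event(self, source):
        # The jump-table entry is selected as a function of the event source.
        address = self.jump_table[source]
        cell_id, data = self.config_memory[address]
        self._apply_or_defer(cell_id, data)

    def _apply_or_defer(self, cell_id, data):
        cell = self.cells[cell_id]
        if cell.reconfigurable:
            cell.config = data                  # reconfigure immediately
        else:
            self.fifo.append((cell_id, data))   # store the data in the FIFO

    def retry_deferred(self):
        # Assumption: deferred entries are retried later, in FIFO order.
        for _ in range(len(self.fifo)):
            cell_id, data = self.fifo.popleft()
            self._apply_or_defer(cell_id, data)
```

A cell that is busy when its event arrives keeps its old configuration, and the new one waits in the FIFO until a later call to `retry_deferred` finds the cell in a reconfiguration state.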

Method for the automatic address generation of modules within clusters comprised of a plurality of these modules

DE 197 04 044 (US 6,038,650)

A method of automatic address generation by units within clusters of a plurality of such units is provided, in which individual configurable elements of a unit can be addressed. It is thus possible to address the individual elements directly for reconfiguration. This is a prerequisite for being able to reconfigure parts of the unit by an external primary logic unit without having to change the entire configuration of the unit. In addition, the addresses for the individual elements of the units are automatically generated in the X and Y directions, so that the addressing scheme represents the actual arrangement of units and configurable elements. Because the addresses are generated automatically, manual allocation of addresses is not necessary. In accordance with the present invention, a cluster is provided with a number of configurable units, each having two inputs for receiving the X address of the last element of the preceding unit in the X direction (row) and the Y address of the last element of the preceding unit in the Y direction (column), and two outputs for relaying to the next unit the position of the last element of the unit in the X direction and in the Y direction.
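The chained X/Y inputs and outputs can be sketched in a few lines of Python. This is only an illustration under assumptions that do not come from the patent: units at the cluster edge receive 0 as their input, element addresses therefore start at 1, and the function name `assign_addresses` and the `(width, height)` description of a unit are hypothetical.

```python
def assign_addresses(unit_sizes):
    """Assign (x, y) addresses to the elements of a grid of units.

    unit_sizes is a list of rows; each row is a list of (width, height)
    tuples giving the number of elements of a unit in X and Y.
    Returns a dict mapping (unit_row, unit_col) to a list of addresses.
    """
    addresses = {}
    # Y address of the last element of the unit above, one entry per column.
    # Assumption: units in the first row receive 0.
    y_out_above = [0] * len(unit_sizes[0])
    for r, row in enumerate(unit_sizes):
        x_in = 0  # assumption: the first unit of each row receives 0
        for c, (width, height) in enumerate(row):
            y_in = y_out_above[c]
            addresses[(r, c)] = [
                (x_in + 1 + i, y_in + 1 + j)
                for j in range(height)
                for i in range(width)
            ]
            # Outputs relayed to the neighbours: address of the last element.
            x_in = x_in + width
            y_out_above[c] = y_in + height
    return addresses


cluster = [[(2, 2), (3, 2)],
           [(2, 1), (3, 1)]]
for unit, elems in assign_addresses(cluster).items():
    print(unit, elems[0], "...", elems[-1])
```

Each unit only needs the two values handed over by its neighbours, so no address has to be assigned by hand, and the resulting addresses mirror the physical layout.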

Internal bus system for DFPs and units with two- or multi-dimensional programmable cell architectures, for managing large volumes of data with a high interconnection complexity

DE 197 04 742 (US 08/946,881; PCT DE98/00456)

An internal bus system for DFPs and units with two- or multi-dimensional programmable cell architectures, for managing large volumes of data with a high interconnection complexity. The bus system can transmit data between a plurality of function blocks, where multiple data packets can be on the bus at the same time. The bus system automatically recognizes the correct connection for various types of data or data transmitters and sets it up.

Method of self-synchronization of configurable elements of a programmable unit

DE 197 04 728 (US 6,081,903)

A method of synchronizing and reconfiguring configurable elements in a programmable unit is provided. A unit has a two or multi-dimensional programmable cell architecture (e.g., DFP, DPGA, etc.), and any configurable element can have access to a configuration register and a status register of the other configurable elements via an interconnection architecture and can thus have an active influence on their function and operation. By making synchronization the responsibility of each element, more synchronization tasks can be performed at the same time because independent elements no longer interfere with each other in accessing a central synchronization instance.

Method for hierarchical caching of configuration data having dataflow processors and modules having two- or multidimensional programmable cell structure (FPGAs, DPGAs, etc.)

DE 198 07 872

A method of caching commands in microprocessors having a plurality of arithmetic units and in modules having a two- or multidimensional cell arrangement is provided. The method includes combining a plurality of cells and arithmetic units to form a plurality of groups, assigning a cache unit to a group, and connecting the cache unit to a higher level unit via a tree structure. The cache unit may send requests for required commands to the higher level cache unit, which may return a command sequence including the required command, if the higher level cache unit holds the first command sequence including the required command in the higher level cache unit's local memory.
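The request path through the cache tree can be sketched as follows. This is a simplified Python illustration, not the patented implementation: the `CacheUnit` class, the recursive forwarding of a miss up the tree, and the choice to keep a copy of every returned sequence in the requesting unit are assumptions.

```python
class CacheUnit:
    """Hypothetical cache unit in a tree of cache units."""

    def __init__(self, parent=None, sequences=()):
        self.parent = parent  # higher-level cache unit, None at the root
        self.sequences = [tuple(s) for s in sequences]  # local memory
        self.forwarded = 0    # number of requests sent to the parent

    def lookup(self, command):
        # Return the locally held command sequence that contains the command.
        for seq in self.sequences:
            if command in seq:
                return seq
        return None

    def fetch(self, command):
        seq = self.lookup(command)
        if seq is not None:
            return seq
        if self.parent is None:
            raise KeyError(command)
        # Miss: request the command from the higher-level cache unit, which
        # returns the whole command sequence that includes it.
        self.forwarded += 1
        seq = self.parent.fetch(command)
        self.sequences.append(seq)  # assumption: keep a local copy
        return seq


root = CacheUnit(sequences=[("load", "add", "store"), ("mul", "shift")])
mid = CacheUnit(parent=root)
leaf_a, leaf_b = CacheUnit(parent=mid), CacheUnit(parent=mid)
print(leaf_a.fetch("add"))    # travels leaf_a -> mid -> root
print(leaf_b.fetch("store"))  # served by mid without asking root again
```

Because a whole sequence is returned rather than a single command, a later request from a sibling group for a neighbouring command can be served one level up without reaching the root.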

Speed-optimized cache system

DE 198 09 640

Several cache memories are used instead of one large contiguous cache memory. Each memory has a defined address range. A plurality of arithmetic units can access a plurality of cache memories because the cache memory is selected on the basis of the defined addresses. If several arithmetic units access the same cache memory, arbitration selects one of the arithmetic units per time unit and grants it the right of access. If the data is not available in the cache memory, a burst access to the memory occurs, that is, a complete cache line (CL) is written to the memory or read from the memory.
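The selection of a cache memory by address range and the per-time-unit arbitration can be sketched in Python. The abstract does not say how the arbiter picks a winner, so the fixed priority used here (lowest unit number wins) is an assumption, as are the function name `arbitrate` and the equally sized address ranges.

```python
def arbitrate(requests, bank_size):
    """Resolve one time unit of cache accesses.

    requests maps an arithmetic unit id to the address it wants to access.
    Each cache memory (bank) covers bank_size consecutive addresses.
    Returns (granted, stalled): granted maps bank index -> unit id,
    stalled lists the units that must retry in a later time unit.
    """
    granted = {}
    stalled = []
    # Assumption: fixed priority, the lowest unit id wins a conflict.
    for unit_id in sorted(requests):
        bank = requests[unit_id] // bank_size  # bank chosen by address range
        if bank in granted:
            stalled.append(unit_id)            # bank already taken this time unit
        else:
            granted[bank] = unit_id
    return granted, stalled


granted, stalled = arbitrate({0: 0x010, 1: 0x120, 2: 0x030}, bank_size=0x100)
print(granted)  # {0: 0, 1: 1}
print(stalled)  # [2]
```

Units 0 and 1 address different banks and proceed in parallel, while unit 2 collides with unit 0 on bank 0 and has to wait for a later time unit.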