Transmitting TLM transactions over analogue wire models

Stephan Schulz, Jörg Becker, Thomas Uhle, Karsten Einwich
Fraunhofer Institute for Integrated Circuits IIS, Design Automation Division, Zeunerstr. 38, 01069 Dresden, Germany

Sören Sonntag
Lantiq Deutschland GmbH, Design Platforms & Services, Neubiberg, Germany

Abstract: Nowadays digital systems have very high switching frequencies. Hence analogue effects can have a serious impact on data transmissions between connected modules in System-on-Chip (SoC) designs. These effects include attenuation, delay, and others that have to be taken into account. However, analogue technology models comprise too many details to be usable at system level, as the simulation time would be far too high compared to traditional Transaction Level Modelling (TLM) models. In this paper we illustrate different aspects of using analogue line models as a transmission method for transactions between TLM models. This includes the introduction of analogue signal paths for TLM models and how to avoid the simulation time penalty of analogue technology models. We show how this approach can even be used to apply analogue effects to electronic system level (ESL) performance evaluations by further reducing the amount of detail of the analogue effects.

I. MOTIVATION

Using OSCI TLM-2.0¹ for system simulation holds some important benefits. The high level of abstraction makes it unnecessary to compute the signals of every pin, which results in very high simulation speeds. Even very complex systems can be designed faster because the communication between modules is focused on the transmitted data itself and not on its transmission method. Thus, the physical layer of communication is not present in TLM-2.0 [1]. With our SystemC-AMS [3, 4, 9] line module, we have gained a speed-up of 6 · 10^4 in simulation time compared to a simulation with an analogue model resulting from layout extraction, while maintaining all analogue effects.

A. Analogue effects in digital systems

A common problem in today's systems is that high switching frequencies introduce analogue effects into the transmission paths of a digital system. These effects lead to transmission errors on an otherwise digitally driven interconnect. Those errors have to be corrected or prevented in the first place and should be considered while simulating and verifying a given system. If they are not taken into consideration during verification, they will not be discovered until hardware prototypes are examined. This results in a huge cost for redesigning the whole system to work around these unwanted analogue effects. To manage this issue in an early phase of the system design, an analogue line model is used to transmit TLM transactions. This model is adjusted by values gathered from a simulation with a customised analogue technology model of a given physical interconnect. Transmission errors caused by high switching frequencies can then be prevented in the digital system by using error correction or similar methods designed with respect to the analogue line models.

B. Serial transmission

Further motivation is given by the fact that parallel transmissions raise growing problems concerning crosstalk and die area in System-on-Chip designs. Crosstalk and jitter are aggravated to a great extent by chip shrinking, which makes parallel lines more vulnerable to these effects. The higher bus frequencies of currently designed chips reinforce these analogue effects even further. Thus, serial transmissions are introduced to be able to drive higher bus frequencies while raising data throughput, or at least maintaining it, and saving die area. In computer systems the most notable example is the switch from parallel (PATA) to serial (SATA) interfaces for magnetic and optical storage devices, which also offers higher data throughput. The switch from parallel to serial interfaces is not limited to off-chip interconnects but is also prevalent in on-chip interconnects.

It is desirable to incorporate analogue effects at higher abstraction levels since the limitations of transmission throughput can then be considered in the design process of the whole system. Analogue line models have to be simplified, as their slow simulation speed prevents the simulation of the large number of lines needed in SoC designs. However, the high number of analogue details is not necessary at higher abstraction levels, as we will show in this paper.

This work has been developed in the project RapidMPSoC. RapidMPSoC (project label 01M3085) is partly funded within the Research Programme ICT 2020 by the German Federal Ministry of Education and Research (BMBF). The authors are responsible for the content of the paper.

¹ TLM (Transaction Level Modelling) is a design method which focuses on data transmission between two modules without considering the transmission in real hardware. The Open SystemC Initiative (OSCI) TLM Working Group released a user manual [6] for their interpretation of transaction level modelling, called TLM-2.0. The name of this standard is used interchangeably with the general term TLM in this paper.

978-3-9810801-6-2/DATE10 © 2010 EDAA

II. ADDING A SIGNAL PATH TO TLM-2.0

Figure 1. Abstract transmission of a transaction according to TLM-2.0

SystemC [5] modules are usually connected by discrete-event signals of a specified type. This does not apply to TLM modules, which are connected by sockets defined in TLM-2.0 [6]. These sockets provide interfaces for the actual TLM transmission. TLM-2.0 defines three classes of interfaces: transport, direct memory, and debug transport. In our case we only use the transport interface in its non-blocking mode.
Blocking transport can easily be replicated if needed and will not be covered explicitly. Therefore, the non-blocking transport interface is assumed throughout. The interfaces introduce certain methods which have to be implemented by TLM initiators and TLM targets. An initiator in the sense of TLM is a module which transmits data to or requests data from a target and is therefore the active component of an initiator-target connection. Usually, a SystemC module with TLM sockets also implements their interfaces, though this is technically not enforced. Additionally, a SystemC module can be an initiator to multiple targets as well as a target to multiple initiators, or both at the same time.

Because communication between initiators and targets is implemented using sockets which provide interfaces, transmitting a transaction is only a method call from one SystemC object to another, with the transaction being passed by reference as a parameter (see Fig. 1). Consequently, no explicit signal path exists to transmit the transaction. Because the actual transaction data is never transmitted, no delay or modification can be introduced without changing the initiator or target of a given transaction.

A. Transmitting transactions

To compensate for this problem, we developed a TLM interconnect module called 'Debugged Interconnect'. Interconnect modules in the context of TLM are SystemC modules which act as an intermediate pass-through for transactions. They have the ability to modify and delay a transaction on the forward and backward path of a transaction transmission, but they never create a transaction of their own, nor are they a final target. Our goal is that a transaction reaches its target after the amount of time which is needed to transmit its payload over any given line model. Additionally, all transmission errors which occur due to physical constraints of the line model should be visible in the payload but not in the accompanying transaction flags and data fields.
The benefit of such an interconnect module is that no changes to the module logic are required in the original initiator or target modules. The insertion of an interconnect module and its impact on transaction transmission can be seen in Figure 2. The delays which result from sending the transaction payload over an arbitrary line model are displayed in red. Those delays will be determined by the analogue line model during the simulation. Any transmission error would only be visible to the transaction receiver in the payload. Flags and data of the transaction object stay unmodified, as handling transmission errors in this meta-data would complicate the whole transaction handling and would not add any value to our simulation efforts.

Figure 2. Abstract transmission of a transaction according to TLM-2.0 with an interconnect module inserted

B. Modifying transactions

Each debugged interconnect module presents an interface to delay and modify a passing transaction payload using call-back mechanisms. This interface was introduced to keep the interconnect module generic for other use cases. Every module which utilises this interface is called an 'adaptor'. Adaptors have to register themselves with the specific interconnect module whose transactions shall be intercepted. If no adaptors are registered with a debugged interconnect, it does not modify or delay the transaction in any way and therefore cannot be noticed by any measurement. The system then behaves as shown in Figure 1. If a transaction reaches an interconnect with at least one registered adaptor, every adaptor is given the chance to modify or delay the transaction. Therefore the functionality of driving an arbitrary line model is contained in an adaptor which registers with such an interconnect.
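The call-back mechanism can be illustrated with a minimal Python sketch. It is not the SystemC implementation used in this work: the names `Transaction`, `DebuggedInterconnect`, `register`, `transport`, and `line_adaptor` are hypothetical, and the delay value and the bit flip are made up purely for illustration. The sketch only shows that an interconnect without adaptors is transparent, while a registered adaptor may delay and modify the payload before it is forwarded.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    payload: int        # payload bits carried by the transaction
    delay_ps: int = 0   # accumulated timing annotation in picoseconds


class DebuggedInterconnect:
    """Pass-through module: forwards transactions after notifying its adaptors."""

    def __init__(self, forward):
        self._forward = forward   # transport call of the next module (hypothetical)
        self._adaptors = []       # registered call-backs

    def register(self, adaptor):
        self._adaptors.append(adaptor)

    def transport(self, trans):
        # Every registered adaptor gets the chance to modify or delay the transaction.
        for adaptor in self._adaptors:
            adaptor(trans)
        return self._forward(trans)


def line_adaptor(trans):
    """Illustrative adaptor: adds a made-up line delay and a made-up bit error."""
    trans.delay_ps += 500     # assumed delay, for illustration only
    trans.payload ^= 0b001    # assumed error: flip the least significant bit


received = []


def target(trans):
    received.append((trans.payload, trans.delay_ps))
    return "OK"


# Without adaptors the interconnect is transparent.
plain = DebuggedInterconnect(target)
plain.transport(Transaction(payload=0b100))

# With one adaptor registered, payload and timing are changed on the way.
with_line = DebuggedInterconnect(target)
with_line.register(line_adaptor)
with_line.transport(Transaction(payload=0b100))
```

In this sketch the initiator and the target are untouched; only the module between them changes, which mirrors the design goal stated above.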
In Section IV we will describe how we use these adaptors to connect the analogue line model to a TLM model in order to simulate the physical constraints of transaction payload transmissions over an analogue line. There are also several other applications, such as co-verification and logging of transactions, which are not the focus of this paper.

III. ANALOGUE LINE MODEL

Models of analogue lines resulting from layout extraction are extremely detailed. Thus, the slow simulation speed of these models makes them inadequate for system level design. Our model results from the layout extraction of a real chip. Simulating 1 ns of a data transmission with this model, excluding any Device Under Test (DUT), takes approximately 20 minutes. Adding a netlist with four DUTs increases the simulation time to 45 minutes for 1 ns of simulated time. These values were gathered while simulating the model of the circuit resulting from layout extraction shown in Figure 3. A commercial SPICE-like simulator was used to gather these results.

Figure 3. Selected part with line and buffer

Figure 4. Simulation of a model from layout extraction

Figure 5. Signal response in forward mode simulation

The solution for using analogue line models in system level design nonetheless lies in an adjustable SystemC-AMS line model. The parameters of this SystemC-AMS model are extracted from the results of a simulation with the model from layout extraction. The key idea behind our SystemC-AMS line model is to run the time-consuming simulations only once and to build lookup tables from the simulation results. These lookup tables are used afterwards during the system level simulation with SystemC TLM and SystemC-AMS.

A. Prerequisites

The starting point for our investigations is the observation that the line model is only used for the transmission of digital data. Thus, we can make some simplifying assumptions for the system simulation. It is known that the signals on the lines change only between the digital states 0 (low) and 1 (high). The input signals shall be represented by a superposition (linear combination) of a finite number of delayed step signals, which may include a given rise and fall slope. In that case it is possible to compute the input and output signals of the circuit with the model from layout extraction for each change of the step signals in a preparatory step. Furthermore, it is possible to represent the output signals as a linear combination of delayed step responses as well, using the values of the lookup tables. The size of the lookup tables depends on the step response granularity and has to be chosen by the user.

B. Getting the model's parameters

An interconnect of a multi-chip system consists of several short lines connected by refresh buffers. These are necessary because otherwise the signal would be attenuated too much at the receiver's end to be recognisable. The first step is to select a typical part of the interconnect design. Figure 3 depicts such a typical part containing lines and buffers. This approach also speeds up simulation and parameter extraction, as there is no need to simulate the complete interconnect repeatedly. For this reason, only the highlighted part is used for the simulation and parameter extraction. As our SystemC-AMS line model is cascadable, there is no reason to simulate interconnects consisting of multiple typical parts.

The complete, unchanged netlist from layout extraction is still necessary to obtain the rise time at the output of the DUT, which was measured as 70 ps (see Fig. 4). Now the selected typical part can be detached as DUT for extracting the parameters of the line model. The input of the DUT is the output of a voltage source with the given rise time of 70 ps and a series impedance matching the line impedance of the DUT. The schematic of the DUT is shown in Figure 3. The DUT has to be simulated twice, to get rise time, delay, and signal level in forward and reverse mode.
This data is necessary for the SystemC-AMS line model (see Section III-A) to simulate accurately (e.g. signal reflections at the end of the line). In forward mode simulation the voltage source with the series impedance is connected to the DUT's input. As a real load, the input of a second DUT is connected to the output of the examined DUT. Everything else on the second DUT stays disconnected, as we only need its load behaviour. In reverse mode simulation the voltage source is connected to the DUT's output. The output of another DUT is connected as a real load to the input of the examined DUT. As before, all remaining connectors of the second DUT stay disconnected. Both circuits are combined in one netlist to create a single data file with all necessary traces in a single run. Simulating these four DUTs at once takes about 45 minutes for 1 ns of simulated time. This highlights that it is simply impractical to simulate whole systems that way. Figure 5 clearly shows the delay of the analogue line model in forward mode, but no relevant signal flow was measured in reverse mode.

C. Using the model's parameters in SystemC

The data previously obtained with the model from layout extraction was used to reproduce its behaviour with SystemC-AMS. For this purpose a SystemC-AMS module was developed which reads the parameter file at simulation start and adjusts its behaviour accordingly, as already described in [8]. The user of the SystemC-AMS line model has to determine the simulation time resolution. The read data can then be interpolated, normalised, and stored in lookup tables. The interpolation method (linear or quadratic splines) can be chosen by the user as well. Furthermore, the signal delay is also computed from the simulation data of the model from layout extraction.

D. Simulation cycle of the SystemC-AMS line model

The sequence of input voltages $v_{e1}$ is examined with regard to value changes (change from 0→1 or 1→0) at every time step. After detecting a value change, the current voltage value $v_{e1}(t_n)$, the corresponding voltage difference $\Delta v_{e1}(t_n) = v_{e1}(t_n) - v_{e1}(t_{n-1})$, and the time $t_n$ are stored within a record which is added to a FIFO buffer. The simulation steps can be triggered by SystemC discrete-event signals. As already mentioned, the current values of the input and output signals can be represented by a linear combination of all signal changes up to the current simulation time, for instance

$$v_2(t) = \sum_{t_n < t} v_2^{\mathrm{tab}}(t - t_n)\,\Delta v_{e1}(t_n), \tag{1}$$

where $v_2^{\mathrm{tab}}$ denotes the appropriate tabulated step response matching the sign of $\Delta v_{e1}(t_n)$. As the simulation time proceeds, the sum grows. This would clearly impede the simulation speed. But we assumed furthermore that $v_2^{\mathrm{tab}}(t)$ remains almost constant for $t \ge t_\ell$ (steady state). Then it follows that

$$\sum_{t_n \le t - t_\ell} v_2^{\mathrm{tab}}(t - t_n)\,\Delta v_{e1}(t_n) = v_2^{\mathrm{tab}}(t_\ell) \sum_{t_n \le t - t_\ell} \Delta v_{e1}(t_n) =: k_1(t) \tag{2}$$

and, combined with (1),

$$v_2(t) = k_1(t) + \sum_{t - t_\ell < t_n < t} v_2^{\mathrm{tab}}(t - t_n)\,\Delta v_{e1}(t_n). \tag{3}$$

A special procedure avoids the accumulation of rounding errors during transient simulation with long simulation intervals. This procedure is explained in the following paragraph.

The records in the FIFO buffer are sorted with respect to their time stamps. A record for which $t - t_n$ is greater than the settling time $t_\ell$ of the appropriate step responses is removed from the FIFO buffer. Thus, this record no longer contributes to the sum in (3). Because of

$$\sum_{t_n \le t - t_\ell} \Delta v_{e1}(t_n) = v_{e1}\Bigl(\max_{t_n \le t - t_\ell} t_n\Bigr) \tag{4}$$

we can easily compute the constant $k_1$: the voltage value stored in the record is multiplied by $v_2^{\mathrm{tab}}(t_\ell)$ and assigned to $k_1$ (see (2)).
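The simulation cycle can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the SystemC-AMS module itself: the step response `v2_tab` is a hypothetical linear ramp that is exactly constant after the settling time `T_SETTLE`, a single normalised table is used for both rising and falling edges, and time advances in unit steps. The names `LineModel`, `full_sum`, and `simulate` are chosen for this example only.

```python
from collections import deque

T_SETTLE = 4.0  # hypothetical settling time t_l, in arbitrary time units


def v2_tab(tau):
    """Hypothetical normalised step response: a ramp, constant for tau >= T_SETTLE."""
    if tau <= 0.0:
        return 0.0
    return min(tau / T_SETTLE, 1.0)


class LineModel:
    """Evaluates Eq. (3): settled edges are folded into k1, recent edges stay in a FIFO."""

    def __init__(self):
        self.fifo = deque()   # records (t_n, v_e1(t_n), delta v_e1(t_n)), oldest first
        self.k1 = 0.0         # contribution of all settled edges (initial state of zero)
        self.last_v = 0.0     # previous input value, used for edge detection
        self.max_fifo = 0     # longest FIFO seen, kept for illustration only

    def step(self, t, v_in):
        if v_in != self.last_v:                      # value change detected
            self.fifo.append((t, v_in, v_in - self.last_v))
            self.last_v = v_in
        while self.fifo and t - self.fifo[0][0] >= T_SETTLE:
            _, v_n, _ = self.fifo.popleft()          # this edge has settled
            self.k1 = v2_tab(T_SETTLE) * v_n         # Eq. (4): no running sum needed
        self.max_fifo = max(self.max_fifo, len(self.fifo))
        return self.k1 + sum(v2_tab(t - t_n) * dv for t_n, _, dv in self.fifo)


def full_sum(samples):
    """Reference: Eq. (1) evaluated over every edge since the start of the simulation."""
    edges, prev, out = [], 0.0, []
    for n, v in enumerate(samples):
        v = float(v)
        if v != prev:
            edges.append((float(n), v - prev))
            prev = v
        out.append(sum(v2_tab(n - t_n) * dv for t_n, dv in edges))
    return out


def simulate(samples):
    """Runs the FIFO-based model over input samples taken at unit time steps."""
    model = LineModel()
    out = [model.step(float(n), float(v)) for n, v in enumerate(samples)]
    return out, model.max_fifo
```

Because $k_1$ is overwritten with the stored input value rather than accumulated, no rounding error builds up over long runs, and the FIFO never holds more edges than fit into one settling time.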
The constant $k_1$ equals zero at the beginning of the simulation (initial state of zero) and is updated only when a record is removed from the FIFO buffer. For this reason the FIFO length cannot exceed a fixed number, which can be computed from the settling time $t_\ell$ and the clock frequency.

On top of the SystemC-AMS line module, a hierarchical SystemC module has been created which comprises the DUT in Figure 3. It consists of the line model itself and a module which detects threshold crossings. The threshold voltage of the circuit's buffers has been determined by simulating the model from layout extraction of the previously mentioned typical part of the chip interconnect. The threshold voltage in our SystemC-AMS module may also be assigned by the user. Furthermore, the module can be parameterised to represent an arbitrary number of line sections in series. In this way, an arbitrary length of the modelled interconnect can be simulated. As a result, the duration of a given simulation is only a fraction of the original duration needed to simulate the model from layout extraction, while maintaining the same analogue effects.

IV. INTEGRATION OF THE ANALOGUE LINE MODEL INTO TLM-2.0

For the simulation of an analogue transmission of a transaction's payload it is not enough to have the debugged interconnects (see Section II-A) and the representation of an analogue line model as a SystemC-AMS model. The challenge is that the two cannot be connected directly, because SystemC-AMS and TLM-2.0 have no specified common interfaces to communicate with each other.

A. Connecting SystemC-AMS to TLM-2.0

This was solved by developing adaptors which register themselves with debugged interconnects (see Section II-B) for reading from and writing to a SystemC-AMS model.
The architecture and implementation of the debugged interconnect allow the use of a combined adaptor for reading from and writing to the SystemC-AMS model, but for reasons of clarity and logic we used two debugged interconnects with a single adaptor registered with each one. For simulation purposes it makes no difference whether they are combined or separated. The resulting transaction transmission layout is displayed in Figure 6: TLM initiator and target are the same modules as in the original design (see Fig. 1) and are connected with each other through two debugged interconnect modules (similar to Fig. 2). The analogue sender is an adaptor registered with the debugged interconnect on the initiator side, while the analogue receiver is an adaptor registered with the one on the target side. Both are connected to the analogue line model through analogue signal ports provided by SystemC-AMS.

Figure 6. Coupling of the analogue line model with a TLM-2.0 model

Figure 7. Transmission of a transaction's payload through an analogue line model

Figure 8. Waveforms of a transmission start

B. Transmitting transaction payload

The original transaction transmission process is split into several transmissions which are not visible to the initiator or target:

1) The initiator calls the TLM transport method of the first debugged interconnect to transmit the transaction. This is transparent to the initiator because the correct binding (and therefore the correct interface) is provided by a netlist as given in Figure 6. By means of TLM, the initiator cannot tell whether the interface it has called is that of the original target or that of the interconnect module. Therefore, the initiator behaves as if the target were directly connected to it.

2) As already described in Section II-B, the debugged interconnect notifies every registered adaptor of an incoming transaction. Therefore the write adaptor is given the chance to process the transaction.
After the notification, the debugged interconnect waits for an acknowledgement from the adaptor that the transaction has been processed.

3) The processing of the transaction in the write adaptor starts with transmitting the transaction payload immediately via the analogue channel (see signal writeOut1 in Fig. 8) with a configurable clock frequency. For proper synchronisation, a transmission is announced by a high-low signal sequence immediately followed by the bit representation of the transaction payload. Thus, the payload 100 results in a transmission of 10100, as shown in Figure 8. Once the adaptor has finished sending the transaction payload, it acknowledges to the debugged interconnect that it has finished processing the transaction.

4) Given the acknowledgement from its only adaptor, the debugged interconnect on the initiator side forwards the transaction. Like the initiator, the debugged interconnect is not aware of which module is connected to its initiator port. The transaction is forwarded to the debugged interconnect on the target side as given by the netlist of Figure 6. This debugged interconnect works in exactly the same way as the previous one: it notifies the read adaptor and waits for the acknowledgement before forwarding the original transaction.

5) The read adaptor withholds the acknowledgement until a transmission has been completely received from the analogue line model. Therefore any delay in the analogue line model also delays the original transaction. As shown in Figure 8, the delay in this simulation was about 500 ps. The read adaptor synchronises itself to the transmission by waiting for the aforementioned high-low signal sequence. Anything received afterwards is treated as bits of the transaction's payload.
The length of a payload transmission in bits is known to the read adaptor beforehand, but it could instead be detected by using a special bit sequence which cannot be generated by encoding an arbitrary payload. On completion of the transmission, the value of the original transaction payload is replaced by the payload received from the analogue line model. Any transmission errors are therefore present in the subsequent transaction processing, but they could already be detected in the read adaptor by comparing the payload received from the analogue line model with the one received with the notification from the debugged interconnect. This mechanism can be used to eliminate permanent transmission errors.

6) After receiving the acknowledgement from the read adaptor, the transaction is forwarded to the intended target.

Following these steps, the transmission has been delayed according to the constraints of the analogue line model and contains all transmission errors which may have happened. Sources of transmission errors include a clock frequency that is driven too high and line attenuation, both of which can lead to problems in detecting high and low signal values properly. This detection problem arises when a signal cannot reach its full voltage because the next bit already pulls the voltage down or up. In extreme situations this results in indistinguishable signal noise from which no message can be received properly.

The whole transmission process in terms of simulated time is displayed in Figure 7. Simulated time shown in red indicates delays introduced by the analogue line model. The simulated time of the payload transmission is $c \cdot (p_b + s_b) + d$, where $p_b$ is the bit length of the payload, $s_b$ is the bit length of any synchronisation sequence used, and $c$ is the duration of one clock cycle. The variable $d$ is the transport delay imposed by the wire model, which accounts for the difference between the simulated time needed by the write adaptor and by the read adaptor.
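The framing and the timing formula can be made concrete with a short Python sketch. The helper names `encode_frame`, `decode_frame`, and `transmission_time_ps` are hypothetical, the line is assumed to idle low before a transmission, and the clock period of 1000 ps is an assumed value chosen only for the worked example; the 500 ps line delay is the figure reported for the simulation in Figure 8.

```python
SYNC_BITS = "10"  # high-low sequence announcing a transmission


def encode_frame(payload_bits):
    """Write adaptor side: prepend the synchronisation sequence to the payload bits."""
    return SYNC_BITS + payload_bits


def decode_frame(line_bits, payload_len):
    """Read adaptor side: wait for the high-low sequence, then read payload_len bits.

    Returns None if no complete frame is found. The line is assumed to idle low.
    """
    start = line_bits.find(SYNC_BITS)
    if start < 0:
        return None
    begin = start + len(SYNC_BITS)
    payload = line_bits[begin:begin + payload_len]
    return payload if len(payload) == payload_len else None


def transmission_time_ps(payload_len, sync_len, clock_period_ps, line_delay_ps):
    """Simulated time of one payload transmission: c * (p_b + s_b) + d."""
    return clock_period_ps * (payload_len + sync_len) + line_delay_ps


frame = encode_frame("100")                 # payload 100 is sent as 10100
recovered = decode_frame("00" + frame, 3)   # two idle-low bits precede the frame
# Assumed 1000 ps clock period, 3 payload bits, 2 sync bits, 500 ps line delay.
total_ps = transmission_time_ps(3, len(SYNC_BITS), 1000, 500)
```

With these assumed numbers the transmission occupies 5 clock cycles plus the line delay, that is 5500 ps of simulated time.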
Additionally, a payload transmission reaches the read adaptor before the transaction notification if the transport delay of the analogue line model is shorter than the time needed to write the payload to the analogue line model.

V. SIMPLIFIED LINE MODELS IN PERFORMANCE EVALUATION

Apart from TLM simulations, our approach can be extended to Electronic System Level (ESL) performance evaluations, where the level of abstraction is higher than in TLM. At the performance abstraction level, communication timing is important while data is negligible. Performance simulations have the advantage of a simulation speed up to one order of magnitude higher than that of TLM simulations. In order to abstract communication further, we have implemented a simple channel that reflects timing by utilising the size of the message to be transmitted. Analogue effects have to be abstracted and can be included in the model using statistics for transmission errors. In case of a transmission error, the message is marked as corrupted by the channel. We distinguish two cases: a) recoverable transmission errors and b) non-recoverable errors. In the former case, the receiver has to take care of proper message recovery. At the performance abstraction level this typically affects timing but has no influence on the message itself. In the latter case, however, the message must be deleted by the receiver and a re-transmission must be set up. For our performance evaluations we use the SystemQ [7] framework, which provides several class libraries to model the aforementioned behaviour.

VI. RELATED WORK

A similar approach was used in [2]. In that paper a loosely timed TLM model (TLM-LT) was coupled with timed synchronous data flow (TDF) models. These TDF models were written in SystemC-AMS. In contrast, we use approximately timed TLM models (TLM-AT), which frees us from the issues with loosely timed models described in that paper.
Although the TLM-LT models may be faster, we have yet to encounter a design where this would justify the effort of solving the mentioned problems with 'twisted time warps', as opposed to using the TLM-AT models, which are easier to implement.

VII. CONCLUSIONS AND FUTURE WORK

As shown in this paper, transaction level modelling (TLM) can be adapted to include analogue transmission effects while maintaining a high simulation speed. Using an abstract analogue line model instead of the analogue line model resulting from layout extraction reduces the simulation time from 20 minutes per 1 ns of simulated time (without DUTs) to approximately 20 ms per 1 ns of simulated time (including DUTs), which is a speed-up factor of 6 · 10^4. With this factor in mind, it is possible to design a whole TLM system with analogue transmissions without incurring impractical simulation times. This speed-up can be increased further by abstracting the analogue transmission to fit performance evaluation needs while maintaining its effects on transmissions.

For demonstration purposes the current model uses only unidirectional payloads. Support for bidirectional payloads can easily be added in the same way the unidirectional model works, by sending the payload over the analogue line model a second time.

Further enhancements for simulating analogue effects on digital transmissions are planned. The next step is the incorporation of crosstalk. This has already been implemented in the SystemC-AMS line module, but layout data of the current chip technology is still missing. An interesting issue with crosstalk is that transactions can be interrupted or injected by unexpected start/end-of-packet symbols. Additionally, synchronisation bits missed by the analogue reader need investigation, because the whole payload will probably be missed or read starting in the middle of the transmission.

REFERENCES

[1] L. Cai and D. Gajski. Transaction level modeling in system level design. Technical report, University of California, Irvine, March 2003.
[2] M. Damm, C. Grimm, J. Haase, A. Herrholz, and W. Nebel. Connecting SystemC-AMS Models with OSCI TLM 2.0 Models Using Temporal Decoupling. In Proceedings of Forum on Design Languages (FDL'08), 2008.
[3] OSCI AMS Working Group. An introduction to modeling embedded analog/mixed-signal systems using SystemC AMS extensions. 7th Symposium on Electronic System-Level Design with SystemC, June 2008.
[4] OSCI AMS Working Group. OSCI SystemC Analog Mixed-Signal Extensions, draft 1 edition, December 2008.
[5] OSCI Language Working Group. IEEE Std 1666-2005 Standard SystemC Language Reference Manual, March 2006.
[6] OSCI TLM Working Group. OSCI TLM-2.0 User Manual, June 2008.
[7] S. Sonntag, M. Gries, and C. Sauer. SystemQ: Bridging the gap between queuing-based performance evaluation and SystemC. In Proceedings of Design Automation for Embedded Systems, pages 91–117, 2007.
[8] T. Uhle, K. Einwich, and J. Haase. Efficient transient simulation of lossy coupled interconnects in digital communication applications. In Proceedings of Forum on Design Languages (FDL'07), 2007.
[9] A. Vachoux, C. Grimm, and K. Einwich. SystemC-AMS Requirements, Design Objectives and Rationale. In Proceedings of Design Automation and Test in Europe (DATE'03), pages 388–393, 2003.