
Chapter 4

Implementation

4.1 Hardware specification

4.1.1 Field programmable gate array

The field programmable gate array (FPGA) is the chosen technology for implementing the Belle II level 1 trigger logic. The advantages that the FPGA brings to the trigger system include

• reconfigurable

This is the main reason that led the trigger group to abandon the hard-wired logic used in the Belle experiment. The flexibility not only shortens the development time considerably, but also makes it possible to provide preliminary triggers for early runs even before the trigger logic is finalized. During normal operation, the trigger can be modified at a deeper level to respond sensitively to the ever-changing accelerator conditions.

• high I/O count

The capability to handle a high input count is essential for the trigger system because of the enormous number of signal channels coming from the CDC sense wires and front-ends. With the addition of high speed serial I/O transceivers to some recent FPGA models, more information can be fed to the trigger system for more sophisticated event identification.

• massively parallel

There are tens of thousands of configurable logic blocks (CLBs) in an advanced FPGA. Together with the high I/O count, these logic blocks can be connected to form truly parallel data paths. That is, a simple piece of logic can be replicated thousands of times, and all the computations can be performed simultaneously in real time.

• deterministic latency

FPGAs are rich in synchronous resources (namely, registers/flip-flops). When a synchronous design with pipelined signal processing is implemented on an FPGA, the total system latency can be determined precisely, and the output is generally deterministic. This helps to eliminate dead time in the trigger system.

The FPGA is not without shortcomings, among which the most important ones for the L1 trigger system are

• limited resource

The amount of resources is fixed once the decision is made to purchase and manufacture the electronics with a specific FPGA. When a sub-trigger module is too large to fit in the FPGA, it takes a tremendous amount of effort to optimize the logic. Limited routing resources (i.e. the configurable interconnects) also mean that it can be difficult to meet all the timing constraint requirements in complicated designs.

• error-prone

There are many pitfalls on the way to a successful FPGA design. Many aspects of the design must be treated with great caution, and they are not apparent to non-experts at first glance. The difficulty increases significantly as the design becomes more complicated. As FPGA novices, the members of the trigger group learned about most of these pitfalls at great cost, late in the design phase.

The 2D tracker is implemented on a Virtex-6 FPGA, part number XC6VHX565T-2FF1923. Its early prototype was implemented on another device in the same family, part number XC6VHX380T-2FF1923. Their features are summarized in Table 4.1 [109].

Table 4.1: Virtex-6 FPGA Feature Summary

Device                  XC6VHX380T   XC6VHX565T
Logic cells             382464       566784
Slices                  59760        88560
Distributed RAM (Kb)    4570         6370
Block RAM (Kb)          27648        32832
DSP48E1 slices          864          864
MMCMs                   18           18
Max GTX transceivers    48           48
Max GTH transceivers    24           24
Max I/O banks           18           18
Max user I/O            720          720

4.1.2 The printed circuit board

UT3, which stands for Universal Trigger Board 3, is a PCB developed at KEK for many of the trigger modules. The Track Segment Finders, the 2D trackers, the conventional 3D trackers, the Neuro 3D trackers, the Global Reconstruction Logic, and the Global Decision Logic will all be implemented on the UT3.

In addition to the Virtex-6 FPGA, it contains an SPI flash memory to store the firmware bitstream, a CPLD to program the FPGA at boot-up time, 8 pairs of LEMO connectors for NIM logic I/O (on the LVDS daughter board), 6 GTH multi-gigabit optical modules for high speed serial I/O, and VME bus connectors on the rear panel to communicate with the controlling processor. Fig. 4.1 shows the main board of UT3.


Figure 4.1: The main board of UT3. Figure courtesy of Jing-Ge Shiu.


4.1.3 Clock signals

Each CDC wire hit, along with its hit time (in the form of a TDC count), is digitized in the CDC front-end. Since all the signals that the L1 CDC trigger system receives from the front-ends are already digitized, it works entirely in the digital realm. The clock signal is vital in any digital signal processing system; it is the heartbeat of the chip, without which no register changes its state and no calculation is performed¹. When it comes to high speed serial data transmission, the importance of clock quality cannot be overemphasized. Even slight jitter in the 250 MHz reference clock may result in a total failure of the optical transmission, where the serial data flow at a rate of 10 Gbps. This is why all the UT3s receive their clock signal from a source independent of the usual data acquisition clock signal used elsewhere in the Belle II experiment. In addition, all the UT3s need to receive a common clock signal in order to process different parts of the same data.

There are 4 possible clock signal sources for UT3. Firstly, an on-board 125 MHz oscillator allows for standalone operation of UT3. This is not used during normal operation. Secondly, a pair of synchronous 254 MHz and 127 MHz clock signals, originating from the 509 MHz SuperKEKB RF reference frequency and passing through the Clock Master and the Clock Distributor, enter through the front panel of each individual UT3. The 254 MHz clock is taken as the reference clock of the GTH transceivers and as the source of the internal mixed-mode clock managers (MMCMs), which provide secondary clock signals to the global clock net on the FPGA. The 127 MHz clock is provided as the reference clock of the GTX transceivers, and will be the input to the MMCM in the future. This common source allows multiple UT3s to form a trigger system that processes the same incoming data. Thirdly, a common 127 MHz clock for Belle2Link, the data acquisition framework, enters UT3 through an RJ-45 connector. Finally, a 16 MHz clock signal for the VMEbus enters through the VME connector on the rear panel.

¹ Except for asynchronous systems, of course.

The different clocks do not necessarily have a fixed phase relation. To improve the system stability when data are passed between clock domains, all clock domain crossing control signals are synchronized in the receiving clock domain by two cascaded flip-flops, which are physically placed close to each other under strict timing constraints. Another type of synchronizer, based on block RAM, has also been developed for clock domain crossing data buses. If further instability is observed, the data bus can be secured with this synchronizer.
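The behaviour of the two cascaded flip-flops can be illustrated with a small behavioural model. This is only a sketch in Python, not the firmware itself: the class name `TwoFlopSynchronizer` and the sample sequence are hypothetical, and the model shows only the extra latency that the second stage adds, not metastability or the placement and timing constraints mentioned above.

```python
class TwoFlopSynchronizer:
    """Behavioural model of two cascaded flip-flops in the receiving clock domain.

    Hypothetical illustration: real hardware may go metastable in the first
    stage; this model only reproduces the sampling order and the latency.
    """

    def __init__(self, reset_value=0):
        self.ff1 = reset_value  # first stage, samples the asynchronous input
        self.ff2 = reset_value  # second stage, the output seen by downstream logic

    def clock(self, async_in):
        """Advance one rising edge of the receiving clock and return the output."""
        # Both flip-flops sample on the same edge, so ff2 takes the OLD ff1 value.
        self.ff2, self.ff1 = self.ff1, async_in
        return self.ff2


sync = TwoFlopSynchronizer()
# A control signal from another clock domain is first seen high at the third edge.
samples = [0, 0, 1, 1, 1, 1]
outputs = [sync.clock(s) for s in samples]
print(outputs)
```

In this model the output follows the input one receiving-clock edge after the edge at which the input was first sampled. That extra stage is what gives the first flip-flop a full clock period to settle in real hardware.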

4.1.4 Parallelism

Because the bandwidth is limited by the number of high speed serial I/O ports on the UT3, a single 2D tracker cannot receive all the necessary track segment (TS) information. The task is therefore split across 4 UT3s, each receiving track segments from a different part of the CDC and searching for tracks whose tangent direction at the origin (𝜙0) lies within one quarter of 2𝜋.

This also helps to relieve the burden on the limited computing resources of a single FPGA. The tracking output goes to the individual 3D trackers and Neuro-triggers, and is finally combined and summarized in the GRL and GDL (see Fig. 2.19).

The discretized Hough grid² is divided into 4 parts in the 𝜙-direction, each containing 40 seed columns (90°). To avoid finding the same track in different UT3s, the seed squares in the leftmost column check for connected squares in 2 extra columns to the left. Moreover, the seed squares in the rightmost column search for connected clusters in 4 extra columns to the right. In summary, the working Hough space on each 2D tracker is 46 columns wide (40 + 2 + 4).

The possible output 𝜙0 range of a 2D tracker is 92.25°. Since 2.25° of this range overlaps with that of another 2D tracker, there appears to be some ambiguity. However, the destination of a track in the overlapping area is uniquely determined by its cluster: only the 2D tracker containing the seed square (lower-left) of the cluster will find the track.
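The column bookkeeping described above can be written out as a short sketch. The numbers (4 trackers, 40 seed columns each, 2 extra columns to the left and 4 to the right) are taken from the text. The global column numbering, the assignment of columns 40k to 40k+39 to tracker k, the reading of "left" as lower column index, and the function names are assumptions made for illustration only.

```python
COLUMNS_PER_QUARTER = 40   # seed columns per 2D tracker (from the text)
N_TRACKERS = 4             # number of UT3s sharing the Hough grid (from the text)
EXTRA_LEFT = 2             # extra columns searched to the left (from the text)
EXTRA_RIGHT = 4            # extra columns searched to the right (from the text)

TOTAL_COLUMNS = COLUMNS_PER_QUARTER * N_TRACKERS   # 160 columns cover 360 degrees
COLUMN_WIDTH_DEG = 90.0 / COLUMNS_PER_QUARTER      # 2.25 degrees per column


def working_columns(tracker):
    """Return the global column indices processed by one 2D tracker.

    Assumption: tracker k seeds columns 40k .. 40k+39, and "left" means
    lower column index.
    """
    first_seed = tracker * COLUMNS_PER_QUARTER
    start = first_seed - EXTRA_LEFT
    stop = first_seed + COLUMNS_PER_QUARTER + EXTRA_RIGHT
    # The phi axis is periodic, so indices wrap around modulo the grid size.
    return [c % TOTAL_COLUMNS for c in range(start, stop)]


def owning_tracker(seed_column):
    """Return the tracker whose seed range contains a cluster's seed column."""
    return (seed_column % TOTAL_COLUMNS) // COLUMNS_PER_QUARTER


for k in range(N_TRACKERS):
    cols = working_columns(k)
    print(k, len(cols), cols[0], cols[-1])
```

Under these assumptions each tracker works on 46 columns, and the 2.25° overlap quoted in the text equals the width of one column. Because `owning_tracker` depends only on the seed column, a cluster that extends across a quarter boundary is still reported by exactly one tracker.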

The required range of the input track segments' azimuth in each superlayer is larger than 92.25°, because the low-transverse-momentum tracks at the edge of the quarter curl outward. Figure 4.2 shows the acceptance range of the first 2D tracker (2D0).

² See section 3.1.3.
