\section{Introduction}
The two heavy-ion spectrometers (HADES~\cite{hades-web} and CBM~\cite{cbm-web}) at the GSI Helmholtz Center for Heavy Ion Research (Darmstadt, Germany) and the FAIR accelerator each contain a Ring Imaging Cherenkov (RICH)
-detector for particle identification. The existing RICH at the HADES experiment\cite{hadesrich} is in operation since the year 2000. Originally, it was built using a reflective CsI photo cathode in combination with a MWPC plane for photon detection. At the moment it is being upgraded with a new MA-PMT readout plane consisting of 428 64-channel PMTs (Hamamatsu H12700). The sensitive area of about $1.3~\rm{m}^2$ will be covered with 28,000 individual PMT cells.
+detector for particle identification. The existing RICH at the HADES experiment~\cite{hadesrich} has been in operation since 2000. Originally, it was built using a reflective CsI photocathode in combination with an MWPC plane for photon detection. It is currently being upgraded with a new MA-PMT\footnote{Multi-Anode Photo Multiplier Tube} readout plane consisting of 428 64-channel PMTs (Hamamatsu H12700). The sensitive area of about $1.3~\rm{m}^2$ will be covered by roughly 28,000 individual PMT cells.
The RICH detector for the CBM experiment~\cite{cbmrich,cbmrich2,cbmrich3} will follow the same design, albeit with a larger read-out plane of about twice the size and an even larger sensitive volume. This detector is going to use the same electronics as the HADES setup, with modifications to the read-out system.
-As a third project, the PANDA experiment, to be built at FAIR during the next years, comprises a DIRC as one of its central parts that is planned to use identical electronics as well. The only component to change is the backplane to cope with different detector geometry and slightly larger MCP sensors.
+As a third project, the PANDA experiment, to be built at FAIR over the coming years, comprises a DIRC\footnote{Detection of Internally Reflected Cherenkov light} as one of its central parts, which is planned to use the same electronics as well. The only component to change is the backplane, which has to cope with the different detector geometry and slightly larger MCP\footnote{Multi-anode Micro Channel Plate} sensors.
\includegraphics[width=.5\textwidth]{figures/dirich_system.jpg}
\caption{\label{fig:module} A partially equipped $3\times2$ MA-PMT module. The $10\times15~\rm{cm}^2$ backplane provides all necessary interconnects between PMTs and read-out electronics.}
\end{figure}
-The read-out plane of both RICH detectors is segmented into small modules, consisting of an array of two by three photo multipliers. Such a module measures 10 by 15 cm$^2$ and houses 384 individual photon detection channels. Scalability dictates that all electronics necessary for this detector are to be integrated on the same footprint. Hence, a modular concept with individual plug-in cards has been developed. The main component is a backplane that connects all cards and is used to route all analog and digital connections as well as power lines and common high-voltage supply for all 6 PMTs. A partly equipped module is shown in figure~\ref{fig:module}.
+The read-out plane of both RICH detectors is segmented into small modules, each consisting of an array of two by three photo multipliers. Such a module measures $10\times15~\rm{cm}^2$ and houses 384 individual photon detection channels. Scalability dictates that all electronics necessary for this detector are to be integrated on the same footprint. Hence, a modular concept with individual plug-in cards has been developed. The main component is a backplane that connects all cards and routes all analog and digital connections as well as the power lines and the common high-voltage supply for all 6 PMTs. A partially equipped module is shown in figure~\ref{fig:module}.
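As a cross-check of the channel count, using only the figures quoted in this paper (64 channels per MA-PMT and 32 input channels per DiRich board, introduced in the next subsection):
\[
  N_{\mathrm{module}} = 2 \times 3 \times 64 = 384 = 12 \times 32 ,
\]
so that one fully equipped module carries 12 DiRich boards, i.e.\ two per PMT. For the complete HADES read-out plane the same counting gives $428 \times 64 = 27{,}392$ channels, in line with the rounded figure of 28,000 PMT cells.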
\subsection{The DiRich Board}
\begin{figure}[htbp]
\centering % \begin{center}/\end{center} takes some additional vertical space
\includegraphics[width=.7\textwidth]{figures/dirich1_07_170816.jpg}
- \caption{\label{fig:dirich} The DiRich board. From left to right: Backplane connector, Amplifiers, TDC-FPGA with threshold filters, auxillary electronics, voltage regulators}
+ \caption{\label{fig:dirich} The DiRich board. From left to right: backplane connector, amplifiers, TDC-FPGA with threshold filters, auxiliary electronics, voltage regulators.}
\end{figure}
The DiRich board (shown in figure~\ref{fig:dirich}) houses all electronics necessary for PMT read-out, from the analog pre-amplifiers to the digital read-out data stream. It is completely based on off-the-shelf components to be independent of hard-to-acquire ASICs. Each board has 32 input channels, serving one half of an MA-PMT. Two boards therefore have to fit onto the back of a single photo multiplier to allow for seamless scaling of the system. The size of each board is $47 \times 100 \times 10~\rm{mm}^3$. To keep the cooling requirements at a moderate level, low power consumption was one of the key design aspects.
The central part of the system is formed by an FPGA. The FPGA not only performs the time-to-digital conversion (described below), but also contains the signal discriminators, the threshold generation and the complete DAQ network stack.
-Discrimination of input signals is realized in the devices LVDS receivers: One input is connected to the amplified signal, the other is supplied with an adjustable threshold voltage. The thresholds are generated by the FPGA using a 16 Bit delta-sigma DAC for each channel. Time measurement is accomplished by a tapped delay line TDC, capable of a precision down to 10~$\rm{ps}_{\rm{RMS}}$ (see below).
+Discrimination of input signals is realized in the FPGA's LVDS receivers: one input is connected to the amplified signal, the other is supplied with an adjustable threshold voltage. The thresholds are generated by the FPGA using a 16-bit delta-sigma DAC for each channel. Time measurement is accomplished by a tapped-delay-line TDC, capable of a precision down to 10~$\rm{ps}_{\rm{rms}}$ (see below).
In the triggered read-out architecture of HADES, data is stored in internal buffers until a read-out request is received. A trigger window can be applied to recorded data before it is sent out on a 2 GBit/s link over the backplane. For data communication, the TrbNet~\cite{trbnet} protocol is employed. This protocol has been developed for the full data acquisition system of the HADES experiment and is able to transport trigger information, read-out data and slow-control simultaneously over the same serial link.
\caption{\label{fig:pulses} Two examples of input (top) and output (bottom) signals.}
\end{figure}
-The typical input signals from the PMTs have a length of about 2~ns and an amplitude between 5 and 40~mV. Before this signal can be fed into a fast discriminator, it has to be amplified by a factor of 25. Figure \ref{fig:pulses} shows two different input signals (upper pane) and the corresponding output signals (lower pane).
+The typical input signals from the PMTs have a length of about 2~ns and an amplitude between 5 and 40~mV. Before such a signal can be fed into a fast discriminator, it has to be amplified by a factor of about 25. Figure~\ref{fig:pulses} shows two different input signals (upper panel) and the corresponding output signals (lower panel). The small secondary peak at 17~ns is an artifact of the test setup.
-The discrete amplification stage is built around a wide-band transistor (BFU760F) as a common emitter amplifier. The typical resistor at the collector has been replaced by an inductor (L77 in figure \ref{fig:analog}) in this circuit for various reasons. First, it allows to use a low operating voltage of only 1.1~V while keeping the static current in the transistor low. Additionally, it helps in shaping the output signal as a high-pass filter. The rise time is preserved, but an undershoot is added at the end of the signal to help in returning to the baseline. In the current configuration, the amplifier takes only 50~ns to return to the baseline, avoiding pile-up and wrong time measurements for close signals. Lastly, the undershoot generates a fast crossing of the threshold resulting in a better time-over-threshold measurement. Each channel is galvanically isolated by a small transformer to decouple grounds of the PMT and the amplifier.
+The discrete amplification stage is built around a wide-band transistor (BFU760F) used as a common-emitter amplifier. The usual collector resistor has been replaced by an inductor (L77 in figure~\ref{fig:analog}) in this circuit for several reasons. First, it allows the use of a low operating voltage of only 1.1~V while keeping the static current in the transistor low. Additionally, it acts as a high-pass filter and thereby helps in shaping the output signal. The rise time is preserved, but an undershoot is added at the end of the signal to help in returning to the baseline. In the current configuration, the amplifier takes only 50~ns to return to the baseline, avoiding pile-up and wrong time measurements for closely spaced signals. Lastly, the undershoot generates a fast crossing of the threshold, resulting in a better time-over-threshold measurement that is less prone to errors introduced by external noise. Each channel is galvanically isolated by a small transformer to decouple the grounds of the PMT and the amplifier.
+
+The amplification stage has been measured to consume 12~mW per channel. The typical amplification varies between 28 for small signals and 18 for the largest signals of 40~mV amplitude. This is expected due to the limited slew rate of the amplification stage. As the amplitude of the resulting pulse is not measured, there is no negative influence on the accuracy of acquired data.
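As a rough illustration of these figures, and assuming that the quoted gains apply at the lower and upper end of the 5 to 40~mV input range, the amplitude at the discriminator input lies between
\[
  5~\mathrm{mV} \times 28 = 140~\mathrm{mV}
  \quad\mbox{and}\quad
  40~\mathrm{mV} \times 18 = 720~\mathrm{mV} .
\]
The amplifier power adds up to
\[
  P_{\mathrm{board}} = 32 \times 12~\mathrm{mW} = 384~\mathrm{mW},
  \qquad
  P_{\mathrm{module}} = 384 \times 12~\mathrm{mW} \approx 4.6~\mathrm{W},
\]
for one DiRich board and one 384-channel module, respectively. These numbers cover the amplification stages only; the FPGAs and voltage regulators are not included.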
+
+The gain of different channels varies between 22 and 27 for identical input signals, that is, by about $\pm 10\%$. This is much lower than the intrinsic pulse height distribution of signals from the photo-multipliers, which can vary by more than a factor of two within one channel for single-photon signals. This allowed us to omit any feedback or gain stabilization circuitry that would compromise either the timing or the power consumption requirements.
+
+The amplified signal with a fast rise time of less than 1~ns is then fed into the inverting input of an LVDS receiver of the FPGA (LFE5UM-85F-8BG381C). The non-inverting input is supplied with an adjustable constant voltage. This threshold voltage for the discriminator is produced by a delta-sigma DAC output of the FPGA individually for each channel. The DAC uses a simple, two-stage low-pass filter connected to the output pin. It reaches a resolution of 38~$\upmu \rm{V}$ and shows no measurable ripple. The switching of the 32 channels is timed such that no two channels switch at the same time, and the switching frequency of each channel is limited to a few MHz to keep the generated noise level as low as possible.
+
+The high resolution of the DAC is not strictly necessary for proper operation of the device, but is available at practically no additional cost. The intrinsic noise band of the discriminator is measured to be about 8~mV, implying that a resolution of two or three steps per millivolt would be sufficient. Nevertheless, the available resolution helps to accurately determine noise and signal distributions.
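For orientation, the quoted step size is consistent with a 16-bit converter spanning a full-scale range of roughly 2.5~V. The full-scale value is an inference from the numbers given here, not a specification stated in this paper:
\[
  V_{\mathrm{FS}} \approx 2^{16} \times 38~\upmu\mathrm{V} \approx 2.49~\mathrm{V},
  \qquad
  \frac{1~\mathrm{mV}}{38~\upmu\mathrm{V}} \approx 26~\mbox{steps per millivolt}.
\]
The 8~mV noise band is therefore sampled by about 210 DAC steps, roughly ten times finer than the two or three steps per millivolt that would be sufficient.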
+
+The offset of the discrimination threshold varies strongly (by up to 30~mV) between channels, as this is not a strictly controlled performance figure of an LVDS input. This can be compensated by the individually adjustable threshold voltage of each channel. All channels can be set to a defined threshold by an automatic procedure that scans the threshold voltage to determine the edge of the intrinsic noise band.
-The amplification stage has been measured to consume 12~mW per channel. The typical amplification varies between 28 for small signals and 18 for the largest signals of 40~mV amplitude. This is expected due to the limited bandwidth of the amplification stage. As the amplitude of the resulting pulse is not measured, there is no negative influence on the accuracy of acquired data.
-The amplified signal with a fast rise time of less than 1~ns is then fed into the inverting input of an LVDS receiver of the FPGA (LFE5UM-85F-8BG381C). The non-inverting input is supplied with an adjustable constant voltage. This threshold voltage for the discriminator is produced by a delta-sigma DAC output of the FPGA individually for each channel. The DAC uses a simple, two-stage low pass filter connected to the output pin. This DAC reaches a resolution of 38~$\upmu \rm{V}$ and shows no measurable ripple. The switching of the 32 channels is timed such that no two channels switch at the same time and the switching of each channel is limited to few MHz to keep the generated noise level as low as possible.
\subsubsection{Time Measurement}
-Time measurement is accomplished by a tapped delay line in the FPGA: The input signal travels along a line of about 300 delay elements (LUTs). If a hit is detected, the delay chain is read-out, decoded and stored in an internal buffer for read-out.
+Time measurement is accomplished by a tapped delay line in the FPGA: the input signal travels along a line of about 300 delay elements, realized by routing the signal through basic logic blocks (look-up tables, LUTs) of the FPGA. If a hit is detected, the delay chain is read out and decoded, and the position is stored in an internal buffer for read-out.
-The intrinsic dead-time of each TDC channel is 15~ns. Hence, to be able to measure both edges of the few nanosecond wide input signals, an internal stretcher in the FPGA is used. The input signal is sent through in-FPGA routing and such delayed by 20 to 30~ns. This gives the TDC time to measure, decode and store the leading edge, before the delayed trailing edge is measured in the same TDC channel as well. The TDC implementation is described in further detail in \cite{tdc}.
+The intrinsic dead time of each TDC channel is 15~ns. Hence, to be able to measure both edges of input signals with a width of only a few nanoseconds, an internal stretcher is implemented in the FPGA. The input signal is sent through in-FPGA routing and delayed by 20 to 30~ns. The original leading edge is then combined with the delayed trailing edge into a single pulse and fed to the TDC. This gives the TDC time to measure, decode and store the leading edge before the delayed trailing edge arrives and is measured in the same TDC channel as well. The TDC implementation is described in further detail in \cite{tdc}.
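The timing condition behind the stretcher can be written compactly. The symbols are introduced here for illustration only: $t_{\mathrm{L}}$ and $t_{\mathrm{T}}$ denote the leading and trailing edge at the discriminator output, $t_{\mathrm{d}}$ the internal delay and $t_{\mathrm{dead}}$ the dead time of the TDC channel. The two edges seen by the TDC are separated by
\[
  \Delta t = (t_{\mathrm{T}} + t_{\mathrm{d}}) - t_{\mathrm{L}} = t_{\mathrm{ToT}} + t_{\mathrm{d}} \geq 20~\mathrm{ns} > t_{\mathrm{dead}} = 15~\mathrm{ns},
\]
so both edges are recorded even for the shortest pulses. The time-over-threshold follows as $t_{\mathrm{ToT}} = \Delta t - t_{\mathrm{d}}$, which presumably requires $t_{\mathrm{d}}$ to be calibrated for each channel, since it is set by in-FPGA routing.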
\begin{figure}
\centering % \begin{center}/\end{center} takes some additional vertical space
The actual timing precision of the DiRich module has been tested by supplying two channels of the board with a PMT-like input signal generated by an arbitrary waveform generator. A single output was used and the signal was split passively to guarantee a jitter-free relation between the signals at the two inputs. The measured time difference between the two channels was analyzed; the result is shown in figure~\ref{fig:timing}. Precision stays better than 20~$\rm{ps}_{\rm{rms}}$ for all expected input signals. The lower the signal amplitude, the lower the timing precision achieved, as the signal-to-noise ratio decreases and the slope of the signal edge becomes less steep. Nevertheless, even signals with only 1~mV amplitude can be detected with a very good precision of 61~$\rm{ps}_{\rm{rms}}$.
+This timing measurement does not include errors due to the walk effect, which depends on the input signal amplitude. If a very good precision is required, a walk correction based on the time-over-threshold measurement can be applied in the offline data analysis. Nevertheless, the precision achieved in the full system is dominated by the transit time spread (TTS) of the MA-PMTs, which is on the order of 300~ps. For MCP-PMTs the spread is much lower, calling for a more precise offline correction.
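To put these numbers into perspective, assume that the individual contributions are uncorrelated and add in quadrature. If the quoted 20~$\rm{ps}_{\rm{rms}}$ refers to the measured time difference between two equally performing channels (the text does not state this explicitly), the single-channel contribution is
\[
  \sigma_{\mathrm{ch}} = \frac{20~\mathrm{ps}}{\sqrt{2}} \approx 14~\mathrm{ps}.
\]
Even when the full 20~ps is taken as a conservative single-channel value, combining it with a PMT transit time spread of 300~ps gives
\[
  \sigma_{\mathrm{tot}} = \sqrt{(300~\mathrm{ps})^2 + (20~\mathrm{ps})^2} \approx 301~\mathrm{ps},
\]
so the electronics adds less than 1~ps to the total spread of an MA-PMT channel.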
+
+Another effect that needs to be accounted for is the temperature dependence of the FPGA-TDC. Our measurements have shown that this effect can be fully compensated by the online calibration procedure implemented in the data acquisition software as shown in \cite{tdc}.
\subsection{Supplementary Boards}
\begin{figure}[htbp]
\includegraphics[width=.4\textwidth]{figures/dirich_power.jpg}
\qquad
\includegraphics[width=.4\textwidth]{figures/dirich_concentrator.jpg}
- \caption{\label{fig:aux} The two auxillary boards: Power supplies (left) and data concentrator (right)}
+ \caption{\label{fig:aux} The two auxiliary boards: power supplies (left) and data concentrator (right).}
\end{figure}
-The front-end module is complemented by two auxillary boards. The power board (figure~\ref{fig:aux}, left side) houses switching and linear voltage regulators to provide all necessary supply voltages. Additionally, trigger (reference time) and clock signals are distributed to all front-ends from this board. Two ADCs allow for detailed monitoring of all voltages and currents. The board currently foresees two possible powering schemes: A 24~V (8 -- 36~V) input and DC-DC converters provide the most simple external supply. A second option is the direct input of externally regulated low voltages (1.1~V -- 3.3~V). In this case, only linear regulators are active and the electromagnetic noise in the system is reduced. Which of the two options will be used in the final system is currently under investigation.
+The front-end module is complemented by two auxiliary boards. The power board (figure~\ref{fig:aux}, left side) houses switching and linear voltage regulators to provide all necessary supply voltages. Additionally, trigger (reference time) and clock signals are distributed to all front-ends from this board. Two ADCs allow for detailed monitoring of all voltages and currents. The board currently foresees two possible powering schemes. A 24~V (8 -- 36~V) input with on-board DC-DC converters provides the simplest external supply. The second option is the direct input of externally regulated low voltages (1.1~V -- 3.3~V). In this case, only the linear regulators are active and the electromagnetic noise in the system is reduced. Which of the two options will be used in the final system is currently under investigation.
-The second board is the data concentrator (figure~\ref{fig:aux}, right side). Built around a Lattice ECP3 FPGA, it serves as hub to connect all front-end modules to the central DAQ system. In the HADES configuration, this board runs a total of 13 links at 2 GBit/s using the TrbNet protocol. The reference time for all TDC is supplied by an additional LVDS signal generated by the central trigger system (CTS). A trigger request can be generated by the board as well: Every front-end can generate a signal based on a configurable combination of input signals or multiplicities and forward this via the concentrator board to the CTS, which in turn triggers the read-out of the full detector.
+The second board is the data concentrator (figure~\ref{fig:aux}, right side). Built around a Lattice ECP3 FPGA, it serves as a hub to connect all front-end modules to the central DAQ system. In the HADES configuration, this board runs a total of 13 links at 2 GBit/s using the TrbNet protocol. This protocol allows trigger and busy information, event data and slow-control information to be sent in parallel on the same data link.
+The reference time for all TDCs is supplied by an additional LVDS signal generated by the central trigger system (CTS). A trigger request can be generated by the board as well: every front-end can generate a signal based on a configurable combination of input signals or multiplicities and forward it via the concentrator board to the CTS, which in turn triggers the read-out of the full detector.
-In the CBM experiment, data acquisition will not be triggered, but free-streaming. This can be achieved by altering the data processing scheme inside the network and endpoints while keeping the underlying network protocol the same. In this setup, also the clock distribution and fixed-latency synchronization messages will be embedded into the optical data stream to reduce the number of electrical connections inside the detector.
+In the CBM experiment, data acquisition will not be triggered, but free-streaming. This can be achieved by altering the data processing scheme inside the network and endpoints while keeping the underlying network protocol the same. In this setup, the clock distribution and fixed-latency synchronization messages will be embedded into the optical data stream to reduce the number of electrical connections inside the detector.
The 2 GBit/s link of the concentrator board is able to transport the data of up to 40 MHits/s from each module. As the CBM and PANDA detectors expect hit rates of up to 200 kHz per channel and 60 MHits/s per module in the central parts of the detector, this bandwidth is not sufficient. Here, an upgraded version of the concentrator using a 4.8 GBit/s optical link will be employed.
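The need for the faster link can be estimated from the quoted figures. The estimate assumes that the sustainable hit rate scales linearly with the line rate, that is, that the number of transmitted bits per hit including all protocol overhead stays the same:
\[
  \frac{2~\mathrm{GBit/s}}{40~\mathrm{MHits/s}} = 50~\mbox{bit per hit},
  \qquad
  \frac{4.8~\mathrm{GBit/s}}{50~\mbox{bit per hit}} = 96~\mathrm{MHits/s}.
\]
Under this assumption the upgraded link exceeds the expected 60~MHits/s per module and also the upper bound of $384 \times 200~\mathrm{kHz} = 76.8~\mathrm{MHits/s}$ that would result if every channel of a module ran at the maximum rate at the same time.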
\section{Summary}
A complete set of data acquisition electronics for single-photon detectors has been developed in a joint project between three large experiments at FAIR. The current development was based on experience gained during past test experiments with detector prototypes from all three groups.
-All parts of the data acquisition system employed make use of modules, protocols and software developed by the TRB community~\cite{trb-web}. That implies that already for the first tests of the DiRich module, a complete and proven toolchain for all aspects of control and data taking was existing and focus could be given to device specific tests.
+All parts of the data acquisition system employed make use of modules, protocols and software developed by the TRB community~\cite{trb-web}. As a result, a complete and proven tool chain for all aspects of control and data taking was available from the very first tests of the DiRich module, and the focus could be placed on device-specific tests.
First laboratory measurements of all modules have been completed successfully; full system tests are ongoing and will be finalized during the next year.