From 13b34be9d19ef9f9cc9a11ad7db872c82a6a170a Mon Sep 17 00:00:00 2001
From: Michael Traxler <M.Traxler@gsi.de>
Date: Fri, 9 Aug 2013 02:55:33 +0200
Subject: [PATCH] corrections and discussion, mt

---
 2013-nomeTDC-ugur-fpga_tdc/NoMeTDC_Ugur.tex | 243 +++++++++++---------
 1 file changed, 131 insertions(+), 112 deletions(-)
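
Note on the counter ranges quoted in this patch: a minimal sketch for
cross-checking them, assuming the 5 ns system clock, the 11-bit coarse
counter and the 28-bit epoch counter stated in the paper text. The constant
and function names are illustrative and do not come from the TDC firmware.

```python
# Sanity check of the counter ranges quoted in the paper text.
# Assumptions (names are illustrative, values are taken from the text):
#   5 ns system clock, 11-bit coarse counter, 28-bit epoch counter.
CLOCK_PERIOD_NS = 5.0
COARSE_BITS = 11
EPOCH_BITS = 28


def coarse_rollover_us():
    """Time until the coarse counter overflows, in microseconds."""
    return (2 ** COARSE_BITS) * CLOCK_PERIOD_NS / 1e3


def total_range_minutes():
    """Range covered by epoch and coarse counters together, in minutes."""
    total_us = (2 ** EPOCH_BITS) * coarse_rollover_us()
    return total_us / 1e6 / 60.0


if __name__ == "__main__":
    print(f"coarse rollover: {coarse_rollover_us():.2f} us")
    print(f"total range:     {total_range_minutes():.1f} min")
```

Under these assumptions the coarse counter rolls over after roughly 10.24 us
and the combined range is roughly 45.8 minutes, which agrees with the "10 us"
and "~45 minutes" figures used in the text.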

diff --git a/2013-nomeTDC-ugur-fpga_tdc/NoMeTDC_Ugur.tex b/2013-nomeTDC-ugur-fpga_tdc/NoMeTDC_Ugur.tex
index 595cd3e..805bf73 100755
--- a/2013-nomeTDC-ugur-fpga_tdc/NoMeTDC_Ugur.tex
+++ b/2013-nomeTDC-ugur-fpga_tdc/NoMeTDC_Ugur.tex
@@ -348,6 +348,12 @@
 \usepackage{hyperref}
 \usepackage{color}
 
+\usepackage{changes}
+%\usepackage[final]{changes}  % use this instead of the first if you want to see
+%the final result
+\definechangesauthor[color=blue]{MT}
+
+
 % correct bad hyphenation here
 \hyphenation{op-tical net-works semi-conduc-tor}
 
@@ -388,10 +394,9 @@ Jan Michel\IEEEauthorrefmark{3},
 Manuel Penschuk\IEEEauthorrefmark{3} and
 Michael Traxler\IEEEauthorrefmark{1}}
 \IEEEauthorblockA{\IEEEauthorrefmark{1}GSI Helmholtz Centre for Heavy Ion
-  Research GmbH, Darmstadt â Germany}
-\IEEEauthorblockA{\IEEEauthorrefmark{2}Jagiellonian University, Krakow â
-  Poland}
-\IEEEauthorblockA{\IEEEauthorrefmark{3}Goethe-University, Frankfurt â Germany}
+  Research GmbH, Darmstadt, Germany}
+\IEEEauthorblockA{\IEEEauthorrefmark{2}Jagiellonian University, Krakow, Poland}
+\IEEEauthorblockA{\IEEEauthorrefmark{3}Goethe-University, Frankfurt, Germany}
 %\IEEEauthorblockA{\IEEEauthorrefmark{4}4th affiliation}
 }
 
@@ -414,11 +419,12 @@ In this paper the implementation of a 65 channel high precision
 Time-to-Digital Converter in a single Field Programmable Gate Array (FPGA) is
 presented. The TDC applies the interpolation method for time measurements. The
 precision of the TDC is increased with the Wave Union Launcher method. In
-order to overcome the minimum pulse width limitation a semi-asynchronous
-pulse stretcher is implemented and a pulse $<\sim500~ps$ is measured. The TDC
-has the typical precision of $7.2~ps$ RMS and the maximum error of $14~ps$ RMS
-on a single channel. Also the 264 Channel TDC Platform - TDC Readout Board,
-TRB3 - applying the described TDC is presented in the paper.
+order to overcome the minimum pulse width limitation a semi-asynchronous pulse
+stretcher is implemented which has been verified to allow a measurement of a
+pulse width $<500~ps$. The TDC has a typical precision of $7.2~ps$ RMS on a
+single channel ($14~ps$ RMS on the worst channel). Also the 264 Channel
+TDC Platform - TDC Readout Board, TRB3 - applying the described TDC is
+presented in the paper.
 \end{abstract}
 
 % IEEEtran.cls defaults to using nonbold math in the Abstract.
@@ -448,10 +454,11 @@ TRB3 - applying the described TDC is presented in the paper.
 \section{Introduction}
 One of the application areas, where Time to Digital Converters (TDCs) are
 widely used, is in particle physics experiments. These experiments constantly
-demand higher rates and more precise time measurements, therefore compelling
+demand higher rates and more precise time measurements, thereby compelling
 constant development on the TDCs. Among many TDC architectures FPGA based TDCs
 gained more and moree importance in recent years, due to their high
-performance and reduced development time compared to ASIC TDCs.
+performance, higher flexibility to adapt to special needs of the
+application and reduced development time compared to ASIC TDCs.
 
 The review of the time interval measurement methods is given and discussed in
 detail in \cite{kalisz_review}. Many TDC designs based on delay lines are
@@ -461,12 +468,14 @@ achieved\cite{tdl1, tdl2, tdl3, tdl4}.
 \section{TDC Architecture}
 
 The TDC architecture is based on the interpolation method, as a long
-measurement range with high precision is needed. In order to achieve high
-precision fine time interpolator based on tapped delay line method (TDL) with
-wave union launcher \cite{wu_wul} is used. For the long measurement range
-double interpolation of coarse counter and epoch counter is applied. All these
-time information are written to a ring buffer as illustrated in
-\autoref{fig:tdc_arch}.
+measurement range with high precision is needed. In order to achieve a high
+precision a fine time interpolator based on tapped delay line method (TDL)
+with a wave union launcher \cite{wu_wul} is used. \replaced[id=MT]{The coarse counter
+  ($5~ns$ period) and an epoch counter ($10~\mu s$ period) complete the time
+  measurement with a total range up to seconds.}{For the long measurement
+  range a double interpolation of coarse counter and an epoch counter is
+  applied.} All this time information is written to a ring buffer as
+illustrated in \autoref{fig:tdc_arch}.
 
 \begin{figure}[!t]
   \centering
@@ -478,19 +487,19 @@ time information are written to a ring buffer as illustrated in
 
 \subsection{Fine Time Interpolator}
 
-For the fine time interpolator TDL method is implemented \autoref{fig:tdl}, as
-the architectures of the modern FPGAs are well suited for the method. This
-method applies the intrinsic delays of the delay elements for time
-measurements. While the start signal propagates through the delay elements
-along the delay line, the rising edge of the stop signal, which has minimal
-skew (in theory $0~s$), samples the state of the delay line. With the location
-of the start signal and the intrinsic delay of a single delay element the time
-between the start and the stop signals can be calculated. As the intrinsic
-delay of the delay elements effects the bin width of the fine time
-interpolator, the shorter the intrinsic delay is, the higher the precision
-will be. Therefore we apply the carry chain lines in the FPGA, which are
-dedicated for fast arithmetic operations and have logic elements with very
-small intrinsic delays.
+For the fine time interpolator a TDL method is implemented (\autoref{fig:tdl}),
+as the architectures of the modern FPGAs are well suited for the method. This
+method \replaced[id=MT]{utilizes}{applies} the intrinsic delays of the delay
+elements for time measurements. While the start signal propagates through the
+delay elements along the delay line, the rising edge of the stop signal, which
+has a minimal skew (in theory $0~s$), samples the state of the delay
+line. With the location of the start signal and the intrinsic delay of a
+single delay element the time between the start and the stop signals can be
+calculated. As the intrinsic delay of the delay elements affects the bin width
+of the fine time interpolator, the shorter the intrinsic delay is, the higher
+the precision will be. Therefore we \replaced[id=MT]{make use of}{apply} the
+carry chain lines in the FPGA, which are dedicated for fast arithmetic
+operations and have logic elements with very small intrinsic delays.
 
 \begin{figure}[!t]
   \centering
@@ -499,25 +508,27 @@ small intrinsic delays.
   \label{fig:tdl}
 \end{figure}
 
-\footnotetext{modified from \cite{kalisz_review}}Another affect on the precision is induced by the skewness of the stop signal
-arrival time at the clock inputs of the flip-flops. This skewness should be
-minimal in order to keep the fine time interpolator bins at similar
-width and reduce the non-linearity. Hence the reason we use the clock signal
-as the stop signal, for the reason that, it is distributed over the clock
-lines, which are engineered by the FPGA developers to keep the clock skewness
-at minimum.
+\footnotetext{modified from \cite{kalisz_review}}Another effect on the
+precision is induced by the skewness of the stop signal arrival time at the
+clock inputs of the flip-flops. This skewness should be minimal in order to
+keep the fine time interpolator bins at a similar width and reduce the
+non-linearity. Hence we use the clock signal as the stop signal,
+because it is distributed over the clock
+\replaced[id=MT]{distribution network}{lines}, which are engineered by the
+FPGA developers to keep the clock skewness at minimum.
 
 
 \subsection{Coarse Time Interpolator}
 
-The coarse time Interpolator employs a coarse counter and an epoch counter,
-which are respectively 11-bits and 28-bits long. The coarse counter is triggered
-by the system clock, whereas the epoch counter is driven by the coarse counter
-overflow and together they increase the measurement range up to
-$\sim45~minutes$. The coarse counter information is written to the ring buffer
-for each hit signal, however in order to decrease the excessive bits in the
-data stream the epoch counter information is recorded, if and only if the
-coarse counter overflow occurs.
+The coarse time interpolator \replaced{consists of}{employs} a coarse counter
+and an epoch counter, which are \deleted{respectively} 11 bits and 28 bits
+long\added{, respectively}. The coarse counter is triggered by the system
+clock, whereas the epoch counter is driven by the coarse counter overflow and
+together they increase the measurement range up to $\sim45~minutes$. The
+coarse counter information is written to the ring buffer for each hit signal,
+however in order to decrease the \replaced[id=MT]{amount of transported
+  data}{excessive bits in the data stream} the epoch counter information
+is recorded \replaced{only}{, if and only} if the coarse counter overflow occurs.
 
 Using the Nutt method \cite{nutt} the total time information of a hit signal
 and the time interval between two hit signals on different channels
@@ -544,10 +555,11 @@ parasitic reactances \cite{pelka_nonlinearity} the intrinsic delays along the
 carry chain are not uniform. This non-uniformity dramatically decreases the
 sensitivity of the TDL, if the TDL is latched, while the hit signal is
 propagating through an ultra wide bin (UWB). Suggested by \cite{wu_wul}, the
-sensitivity of the TDL can be increased by applying the Wave Union
-Launcher (WUL) method. Using the two transition WUL the maximum bin width is
-decreased to $35~ps$ from $45~ps$ (\autoref{fig:bin_wid}). The effect of the WUL
-on the time precision is shown in \autoref{sec:results}.
+sensitivity of the TDL can be increased by applying the Wave Union Launcher
+(WUL) method. Using the two transition WUL the maximum bin width is decreased
+to $35~ps$ from $45~ps$ \added[id=MT]{What about the mean bin
+  width?}(\autoref{fig:bin_wid}). The effect of the WUL on the time precision
+is shown in \autoref{sec:results}.
 
 \begin{figure*}[!t]
   \centerline{\subfloat[Traditional TDL method.]{\includegraphics[width=2.5in]{bin_wid_sing}%
@@ -560,32 +572,32 @@ on the time precision is shown in \autoref{sec:results}.
   \label{fig:bin_wid}
 \end{figure*}
 
-In order to induce two transitions into the TDL the hit signal is split in the
-FPGA. The delay lines between the split and the TDL must have the similar
-delays. Otherwise, either the first transition might overtake the second one
-and destroy the transition patter in the TDL or the second transition might
-take off much earlier than the first one and over flow the TDL
-for some hits. At this point very careful placement and routing in the FPGA
-comes into the equation.
-
+In order to \replaced{inject}{induce} two transitions into the TDL the hit
+signal is split in the FPGA. The delay lines between the split and the TDL
+must have \deleted{the} similar delays. Otherwise, either the first transition
+might overtake the second one and destroy the transition pattern in the TDL or
+the second transition might take off much earlier than the first one and
+\added[id=MT]{generate an} overflow \added{of} the TDL for some hits. At this
+point very careful placement and routing in the FPGA comes into the equation.
 
 \subsection{Semi-asynchronous Stretcher for Minimum Pulse Width Limitation}
 
 Some detectors used in particle physics experiments - e.g. Microchannel Plate
-(MCP) Detectors used for electron detection - can generate pulses as short as
-$1~ns$. These short signals cannot be measured by the traditional TDL based
-TDCs. In the traditional TDL based TDC concept the state of the start signal
-has to be preserved until the rising edge of the stop signal, as the falling
-edge of the hit signal would induce another transition in the delay line. In
-our case the hit signal has to conserve the logic high state until the rising
-edge of the next clock cycle. In order to guarantee this condition the width of
-the hit signal must be longer than the period of the clock.
+(MCP) Detectors used for \replaced{single photon}{electron} detection - can
+generate pulses as short as $1~ns$. These short signals cannot be measured by
+the traditional TDL based TDCs. In the traditional TDL based TDC concept the
+state of the start signal has to be preserved until the rising edge of the
+stop signal, as the falling edge of the hit signal would induce another
+transition in the delay line. In our case the hit signal has to conserve the
+logic high state until the rising edge of the next clock cycle. In order to
+guarantee this condition the width of the hit signal must be longer than the
+period of the clock.
 
 In our work a semi-asynchronous pulse stretcher is designed to extend the
 length of the hit signals more than one clock period. The demonstration of the
 stretcher is shown in \autoref{fig:stretcher}. The short pulse from the
 detector is connected to the clock input of the D-flipflop and with the rising
-edge of the hit signal the logic '1' at the D input is passed to the Q
+edge of the hit signal the logic '1' at the 'D' input is passed to the 'Q'
 output. After two stages of registers the signal is used to reset the
 stretcher, guaranteeing that the \emph{hit\_in} will stay high for two clock
 cycles. Afterwards the \emph{hit\_in} signal is connected to the TDL. The
@@ -606,13 +618,13 @@ the stretcher concept are discussed in the \autoref{sec:results}:
 
 The non-linearity across the fine time interval is induced by the non-uniform
 intrinsic delays along the carry-chain, as explained above. The maximum
-differential and integral non-linearities, that we measure are 2.74 and 9
-LSB respectively (\autoref{fig:non_linearity}). In order to calculated the
-effect of the non-linearity on the time precision a time interval is
-calculated with a constant quantisation step of $11~ps$ (calculated by
-dividing the clock period of $5~ns$ by the number of delay elements in the
-carry chain) and compared with the calculations done by using the real bin
-width values. The negative effect is calculated as a factor of $\sim2$.
+differential and integral non-linearities that we measure are 2.7 and 9~LSB,
+respectively (\autoref{fig:non_linearity}). In order to calculate the effect
+of the non-linearity on the time precision a time interval is calculated with
+a constant quantisation step of $11~ps$ (calculated by dividing the clock
+period of $5~ns$ by the number of delay elements in the carry chain) and
+compared with the calculations done by using the real bin width values. The
+negative effect is calculated to be a factor of $\sim2$.
 
 In order to overcome the non-linearity issue, we use the code density test
 \cite{codeDensityTest}. A large number of measurements, \textit{$H_T$}, for
@@ -646,22 +658,23 @@ in \autoref{sec:results}.
 \label{sec:results}
 
 Several tests were carried out in order to determine the quality and the
-limitations of the designed TDC. Some of these tests are the precision, mean,
+limitations of the TDC designed. Some of these tests are the precision, mean,
 minimum pulse width, dead time, calibration and temperature test.
 
-In order to test the precision quality of the TDC an LVDS pulse from a pulse
-generator was split using an LVDS splitter ($max9153$ $1~ps$ random
-jitter\cite{max}) and through coaxial cables fed to two different channels of
-the TDC. At least $300\thinspace000$ measurements were collected at stable
-environmental conditions for the offline calibration stage. The time interval
-between the rising edges of the signals was calculated as explained in
-\autoref{eq:tDiff} for the set of data and filled into a histogram. The time
-precision was measured by calculation the root mean square (RMS) of the peak
-without any curve fittings. 
-
+In order to test the precision \deleted{quality} of the TDC an LVDS pulse from
+a pulse generator was split using an LVDS splitter ($max9153$ $1~ps$ random
+jitter\cite{max}) and fed to two different channels of the TDC through
+coaxial cables. At least $300\thinspace000$ measurements were collected at
+stable environmental conditions for the offline calibration stage. The time
+interval between the rising edges of the signals was calculated as explained
+in \autoref{eq:tDiff} for the set of data and filled into a histogram. The
+time precision was measured by calculating the root mean square (RMS) of the
+peak without \added{applying} any \replaced[id=MT]{cuts}{curve fittings}.
+
+\replaced[id=MT]{This is well known and should be omitted}{
 Please note that the RMS values in the time difference histograms are the
 errors of the two channels and one should divide the value with $\sqrt{2}$ in
-order to find the time precision of a single channel.
+order to find the time precision of a single channel.}
 
 The precision measurements were repeated for the designs with and without the
 WUL and the results are given in \autoref{fig:precision}. The time
@@ -723,7 +736,7 @@ considerably deteriorated ($12~ps$).
 \end{figure}
 
 For dead time measurements two pulses were generated in burst mode and the
-readout of the TDC was triggered using Tektronix Arbitrary Waveform Generator
+readout of the TDC was triggered using a Tektronix Arbitrary Waveform Generator
 (AWG7122C). The time interval between the leading edges of the two pulses were
 adjusted with the AWG during the measurements and the number of hits per event
 was analysed. For the time gaps, dead time, $15~ns<t_{dt}$ both of the hits are
@@ -774,27 +787,27 @@ after a couple of degree $^\circ$C, the time precision starts to get worse.
 \section{TDC Readout Board - TRB3}
 
 Using the above mentioned research an FPGA based TDC Platform with 264
-channels is developed to be used in various particle physics experiments and
-medical imaging - TRB3\cite{traxler_trb3}. The platform consists of 5 Lattice
-ECP3 FPGAs each consisting of $150~K$ logic elements. The 4 peripheral FPGAs
-are used as TDCs each containing 65 channels. First channel of each TDC
-digitises the arrival time of the common trigger signal. This way all TDC data
-can be synchronised, if a bigger system, which contains more than one TRB3, is
-used. The central FPGA is used for data concentration, data transfer over
-either GbE or optical links, slow-control and trigger processing. The central
-FPGA also includes 4 extra TDC channels in order to measure the asynchronous
-trigger signal from the detector. Each TDC channel has a maximum of $66.7~MHz$
-burst hit rate and the maximum readout trigger rate of the system is
-$700~kHz$. Thanks to the standard GbE protocol implemented in the central
-FPGA, power supply and an Ethernet cable is the only necessary equipment for
-data taking.
-
-For this platform various front-end electronics (FEE) are developed and are
-being developed for Time-of-Flight (ToF), Time-over-Threshold (ToT) and charge
-measurements. One low cost FEE developed for the ToF and ToT measurements of an
-MCP detector for the PANDA experiment uses an FPGA for setting the thresholds
-and the internal LVDS buffers for signal
-discrimination\cite{ugur_padiwa}. With this setup a time precision of
+channels \replaced{was}{is} developed to be used in various particle
+physics experiments and medical imaging - TRB3\cite{traxler_trb3}. The
+platform consists of 5 Lattice ECP3 FPGAs each consisting of $150~K$ logic
+elements. The 4 peripheral FPGAs are used as TDCs each containing 65
+channels. The first channel of each TDC digitises the arrival time of the common
+trigger signal. This way all TDC data can be synchronised, if a bigger system,
+which contains more than one TRB3, is used. The central FPGA is used for data
+concentration, data transfer over either GbE or optical links, slow-control
+and trigger processing. The central FPGA also includes 4 extra TDC channels in
+order to measure the asynchronous trigger signal from the detector. Each TDC
+channel has a maximum of $66.7~MHz$ burst hit rate and the maximum readout
+trigger rate of the system is $700~kHz$. Thanks to the standard GbE protocol
+implemented in the central FPGA, a power supply and an Ethernet cable are the
+only equipment necessary for data taking.
+
+For this platform various front-end electronics (FEE) \replaced{were}{are}
+developed and are being developed for Time-of-Flight (ToF),
+Time-over-Threshold (ToT) and charge measurements. One low cost FEE developed
+for the ToF and ToT measurements of an MCP detector for the PANDA experiment
+uses an FPGA for setting the thresholds and the internal LVDS buffers for
+signal discrimination\cite{ugur_padiwa}. With this setup a time precision of
 $\sim17~ps$ RMS is achieved. Another FEE for charge measurements for the
 electromagnetic calorimeter detector of the HADES experiment is being
 developed. The FEE integrates the input signal with a capacitor and discharges
@@ -804,6 +817,10 @@ measured using the FPGA TDC. The prototype of the development has proven a
 $0.2\%$ charge precision.
 
 
+\added[id=MT]{Precision vs. Resolution: Nothing to be discussed in a paper,
+  this is a discussion with the referees, I think. But the information at the
+  end, that the FPGA-TDC would have a resolution of 35ps, and a precision of
+  <10ps is interesting. Still I would not put it in to avoid confusion.}
 \section{Precision vs Resolution}
 \label{sec:pre_vs_res}
 In most of the literature about the TDCs, which are based on statistical
@@ -814,6 +831,7 @@ this paper, we believe, the correct word for the figure of merit for quality
 should be ``\textit{precision}'' for the TDCs, which are based on statistical
 approach.
 
+
 ``The property of the set of measurements of being very reproducible
 or of an estimate of having small random error of estimation.'' is given as
 the definition of precision in the web site of the \textit{Organisation for
@@ -839,11 +857,12 @@ case.
 In this paper the architecture of a 65 channel TDC implemented in a single
 FPGA is discussed in detail and the test results displaying some of the
 quality measurements are presented. The TDC has a maximum precision of
-$7.2~ps$ RMS and $<14~ps$ RMS on all channels. The solution for the short
-pulse limitation of the conventional TDCs and the non-linearity correction is
-analysed. Moreover, a 264 channel TDC platform - TRB3 - based on the FPGA TDC
-design is disclosed. At last the terminology for the statistical approach based
-TDCs is argued in \autoref{sec:pre_vs_res}, ``\textit{Precision vs Resolution}''.
+$7.2~ps$ RMS and $<14~ps$ RMS on all channels. \added[id=MT]{The last sentence is
+  unclear to me.} The solution for the short pulse limitation of the
+conventional TDCs and the non-linearity correction is analysed. Moreover, a
+264 channel TDC platform - TRB3 - based on the FPGA TDC design is
+disclosed. Finally, the terminology for the statistical approach based TDCs is
+discussed in \autoref{sec:pre_vs_res}, ``\textit{Precision vs Resolution}''.
 
 
 
-- 
2.51.0