Shift leader guide




Contents:

Introduction
Fill the logbook
Contact telephone number
Background
Survey ?
In case of repetitive DAS crashes ?
Cluster monitoring
In case of power cut or emergency ?
Big Brother : startup procedure
LEP coordinator page : LEP news
To know more on LEP RF cavities : what is available ?




Introduction:


Generalities:

Currently one of the 3 technical shifters is nominated as Shift Leader. (Since 1999, the shift leader is either the DAS or the SC maestro.) His duty is to coordinate the shift crew, which must operate as a real team! The principal responsibility is to take decisions in order to optimize the data taking efficiency, but also to lead a group which must communicate. Too often we see 3 shifters ignoring each other. The shift leader has to communicate with the 2 other shifters and establish a good atmosphere in the team. The shift leader has to monitor the background conditions and call the LEP Operator (77508-77510) if the run is paused for more than 5 minutes because of background conditions. The shift leader has to fill in the Shift leader logbook, which must be the reference logbook used for the data analysis. In case of power cut or emergency, the shift leader is responsible for calling all the people needed to recover the situation as fast as possible, especially during data taking.





If the DAS is stopped or frequently crashing during stable beam conditions, the shift leader will have to take decisions in order to keep DELPHI running, sometimes without a detector.


If DELPHI is stopped for more than 30 minutes for any reason (magnet problems, water problems, gas problems, etc.), the shift leader has to call the RUN COORDINATOR 16 03 29 !!!!

Shift leader logbook



The shift leader has to write CLEARLY the LEP status and any change of it! DELPHI people are often working from home (especially during the weekend) and connect to find out what is going on at the pit at the moment: they must find all important things in the shift leader logbook! One has to write:
Unusual running conditions such as high machine background, high trigger rate, beam dump, trip of detectors
Special problems with detectors: detectors removed from DAS (totally or partially), SC problems with detectors such as HV, LV or gas problems
Beginning, abnormal pauses and end of fill, using the usual options:

START OF SHIFT
START OF FILL
END OF FILL
LIST OF RUNS in fill ( at the end of fill)
DETECTOR SURVEY ( at the end of fill)


In fact the main purpose of this logbook is to make it possible to understand any abnormal condition of the detector, even a few months later when analysing the data. Don't forget that it is your responsibility to provide and write down the information. Try to explain in the shift leader logbook the main reason for any loss of efficiency during the shift.

Contact telephone numbers

Here is the official DELPHI phone list: contact telephone numbers





What to survey ?



Data Quality

The shift leader should make sure that the QC shifter is looking at the histograms and OED often enough and, if possible, should look through the full set of histograms at least once per shift with the QC maestro!

Trigger

The shift leader has to check the state of the trigger as shown by the monitoring display. The trigger rates have to be realistic:


Realistic T1 rates: around 600-800 Hz

Realistic T2 rates: around 5-8 Hz at maximum, depending on the LEP current!

Check the look-up tables once: they must be TRxx on Level 1 and TRyy on Level 2 in the Pythia Tables area (on station AXDETR), where xx and yy are the numbers which can be found in the following Trigger documentation.
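
The rate ranges quoted above lend themselves to a quick sanity check. The sketch below is only an illustration: the function name is hypothetical, and since the guide says "around" without giving a tolerance, treating the quoted numbers as hard limits is an assumption.

```python
# Illustrative sketch only: the limits are the ones quoted in this guide;
# the function name and the strict comparisons are assumptions.
T1_RANGE_HZ = (600.0, 800.0)  # realistic T1 rate
T2_MAX_HZ = 8.0               # realistic T2 rate is 5-8 Hz at maximum


def trigger_rate_warnings(t1_hz, t2_hz):
    """Return a list of warnings; the list is empty if both rates look realistic."""
    warnings = []
    if not T1_RANGE_HZ[0] <= t1_hz <= T1_RANGE_HZ[1]:
        warnings.append(f"T1 rate {t1_hz:.0f} Hz outside 600-800 Hz")
    if t2_hz > T2_MAX_HZ:
        warnings.append(f"T2 rate {t2_hz:.1f} Hz above 8 Hz")
    return warnings


print(trigger_rate_warnings(700, 6))    # no warning expected
print(trigger_rate_warnings(1200, 12))  # both rates flagged
```

A T2 rate below 5 Hz is not flagged here because the realistic value depends on the LEP current.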

With this link to the DELPHI trigger documentation you should be able to know exactly what the various DF functions are doing at level I and level II. Call the trigger on-call expert for any trigger problem! In case of trigger problems you can easily mask or prescale any DF function in order to restart the DAS, but you have to know what you are doing ...

Running efficiency

The trigger dead time is normally 2-3 % (the complement to 100% of the Inst. Live Time on the DAS Display). If it increases above 5% you have to worry about it. The cause can be:



Trigger rate too high ?

Data size too big ?


The shift leader has to understand the reason for the excessive dead time, and call the DAS coordinator if the reason is not understood and the problem not solved!
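
As a small worked example of the dead time rule above: the dead time is the complement to 100% of the Inst. Live Time, and 5% is the level at which to start worrying. The function names below are hypothetical.

```python
# Illustrative sketch: the 5 % limit comes from this guide, the names do not.
DEAD_TIME_WORRY_PERCENT = 5.0


def dead_time_percent(inst_live_time_percent):
    """Dead time is the complement to 100 % of the instantaneous live time."""
    return 100.0 - inst_live_time_percent


def dead_time_is_worrying(inst_live_time_percent):
    """True when the dead time exceeds the 5 % level quoted in the guide."""
    return dead_time_percent(inst_live_time_percent) > DEAD_TIME_WORRY_PERCENT


print(dead_time_percent(97.5))      # 2.5, in the normal 2-3 % range
print(dead_time_is_worrying(93.0))  # True, dead time is 7 %
```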

Helium level in case of quench (or anytime) :

The shift leader has to monitor the helium level in case of a magnet/quadrupole quench. The helium level has to be stable at about 80% in the quadrupole/magnet part and 70% inside the Dewar: Helium level. Check also the "history" line of the page to verify the stability!




Background and radiation monitors :


DELPHI is able to measure the rate of photons (BKG1) and of off-momentum electrons (BKG2). At any time during the fill, the shift leader must be able to understand the level of background and to react quickly if the conditions are too severe for DELPHI.

BKG1 and Maximum Silicium (TPC)=BKG3

We have 2 different ways to measure the photon rate in DELPHI. The first one uses the number of TPC wires hit per second and is called BKG1. One takes the maximum of the 12 TPC sectors to get an absolute number (around 2000), which is normalized to 1 in normal conditions. This BKG1 measured by the TPC is sent to LEP as BKG1 (displayed on the LEP screen and on the trace plots). The second way to measure the photon rate is to use the measurements made by the 8 small TPC silicon detectors which are located on both sides of the TPC (4 on each side). The maximum of the 8 detectors is then normalised to the BKG1 value and is called BKG3 or MAX Silicium (TPC). This number is used to drive Big Brother, who will take the decision to start/pause a run.

BKG1 :

Maximum of the rates of the 12 TPC sectors. One can see the 12 raw numbers on the MIG3 scaler (the TX which is above the trigger control terminal). To check the numbers, go inside MIG3 and look at the first 12 numbers, called NW_A_00 to NW_C_11 (_A for the A side and _C for the C side). These 12 numbers are the rates per chamber and must be more or less at the same level! If one of the numbers is more than 5 times higher than the others, then don't trust it and call the TPC shifter. In normal conditions all numbers must be around 2000 (except scalers 0, 1, 2 and 9, which are disconnected because they are too noisy). Have a look at the following MIG3 plot, which shows a standard state when BKG1=1 (as of 1/6/99): view of scaler MIG3.
The BKG1 trace plot is shown on the XNDE25 terminal (red curve on the upper left trace plot). Assuming that the TPC voltages are OFF when LEP is not in stable beam conditions, BKG1 must be 0 when LEP is filling or injecting. However, the TPC people did some work to get a BKG1 value when the TPC is at the standby voltage (i.e. 1150 V instead of 1435 V).
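
The MIG3 check described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, "5 times more than the others" is interpreted as 5 times the median of the other connected scalers, and the normalisation simply divides the largest connected rate by the approximate normal value of 2000.

```python
from statistics import median

# Values quoted in this guide; everything else in this sketch is an assumption.
DISCONNECTED = {0, 1, 2, 9}  # scalers disconnected because too noisy
NORMAL_RATE = 2000.0         # approximate raw rate corresponding to BKG1 = 1


def suspicious_scalers(rates, factor=5.0):
    """rates: the 12 MIG3 values, list index = scaler number.

    Return the scaler numbers whose rate exceeds `factor` times the median
    of the other connected scalers (our reading of "5 times the others").
    """
    connected = {i: r for i, r in enumerate(rates) if i not in DISCONNECTED}
    suspects = []
    for i, rate in connected.items():
        others = [v for j, v in connected.items() if j != i]
        if others and rate > factor * median(others):
            suspects.append(i)
    return suspects


def approximate_bkg1(rates):
    """Largest connected sector rate, normalised so that about 2000 gives 1."""
    connected = [r for i, r in enumerate(rates) if i not in DISCONNECTED]
    return max(connected) / NORMAL_RATE


normal = [0, 0, 0, 2000, 2100, 1900, 2050, 1950, 2000, 0, 2000, 2000]
print(suspicious_scalers(normal))  # []
print(approximate_bkg1(normal))    # about 1.05
```

If the list of suspects is not empty, the displayed BKG1 should not be trusted and the TPC shifter should be called, as stated above.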

The meaning of the BKG1 numbers is the following:


BKG1 = 1: normal conditions
BKG1 = 3-5: background is a bit too high, to be watched but not worrying!
BKG1 > 7: Careful! Big Brother should pause the run after a few minutes and lower the voltages in order to avoid TPC/OD/ID trips!
LEP should be called (77508-77510) if the run is paused for more than 5 minutes!!! The shift leader has to explain gently but firmly the DELPHI background situation and ask for better conditions. If the situation persists after the call, the shift leader has to call again 5 minutes later!!! If there is no result after two calls, then the run coordinator has to be called, even during the night.


BKG3 = Maximum silicon detectors :

The maximum value of the 8 small TPC silicon detectors located on each side of the TPC is used to give the BKG3 value. This number is normalised to the BKG1 value but may not be perfectly calibrated for high background values: one can observe differences between BKG1 and MAX-SILICON TPC (BKG3), especially for high BKG1 values. This detector is always ON and therefore always sends background information to the display.
The BKG3 trace plot is shown on the XNDE25 terminal (upper left trace plot) as MAX Silicium (TPC) in yellow.
One can check the scalers to verify that the 8 numbers are at the same level; otherwise there may be a hardware problem, which can be the source of strong discrepancies! Check the MIG1 values: scalers 5 to 8, called TPC_SI_A1 to TPC_SI_A4, and scalers 13 to 16, called TPC_SI_A5 to TPC_SI_A8. If one of the 8 numbers is more than 5 times bigger than the others, don't trust it and call the TPC shifter urgently: this problem will trigger a high BKG3 value (displayed on the trace plot as the Max silicon (TPC) value) and Big Brother will pause the run. If you localise a scaler problem (one of the eight much too high, or 4 of them at crazy values 10 times higher than the 4 others), you have to disconnect Big Brother in order to be able to run! The TPC expert on call (Yannick, or Patrick if no answer) has to be called to solve the problem urgently!! Here is, as an example, a standard view of the MIG1 values when BKG3=1 (1/6/99). View of scaler: MIG1.
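
The two MIG1 failure patterns described above (a single silicon scaler much too high, or one group of four at values 10 times higher than the other four) can be told apart with a short check. This is a sketch under assumptions: the function name is hypothetical, a single scaler is compared with the median of the 7 others, and a group counts as faulty when its lowest value exceeds 10 times the highest value of the other group.

```python
from statistics import median


def silicon_scaler_problem(group_1, group_2):
    """group_1: MIG1 scalers 5 to 8, group_2: MIG1 scalers 13 to 16.

    Return "group" if one group of four is more than 10 times the other,
    "single" if one scaler is more than 5 times the median of the others,
    and None if the 8 values look consistent. The exact comparison rules
    are assumptions made for this sketch.
    """
    for high, low in ((group_1, group_2), (group_2, group_1)):
        if min(high) > 10 * max(low):
            return "group"
    values = list(group_1) + list(group_2)
    for i, value in enumerate(values):
        others = values[:i] + values[i + 1:]
        if value > 5 * median(others):
            return "single"
    return None


print(silicon_scaler_problem([100, 110, 90, 100], [105, 95, 100, 100]))   # None
print(silicon_scaler_problem([100, 110, 900, 100], [105, 95, 100, 100]))  # single
```

In both problem cases the guide asks for the same reaction: disconnect Big Brother in order to be able to run, and call the TPC expert on call.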

BKG2

BKG2 reflects the rate of off-momentum electrons. The measurement is done by the VSAT. The BKG2 value is the sum of the 4 small VSAT calorimeters, normalized to a standard value: BKG2 = 1-2 means normal conditions ...

BKG2 trace plot is shown on the XNDE25 terminal ( upper left trace plot ) as VSAT in blue.


A too high BKG2 value can be due to:

A hardware problem with the VSAT: check the MIG1 scaler to verify that none of the 4 calorimeters is counting like crazy, and call the VSAT operator if the discrepancies are too large!


A vacuum leak in the LEP pipe causing a large off-momentum electron rate! LEP has to be called quickly if the diagnosis is confirmed (call Patrick to get confirmation, day or night), otherwise part of the detector can be damaged!



If both BKG1 (+BKG3) and BKG2 are at very large and stable values, this is a strong argument to say: WE HAVE a VACUUM LEAK. Don't hesitate in this case to call LEP to dump the beam!!


Radiation Monitors

These devices sit on the beam pipe 1.5 meters away from the interaction region and measure the radiation rate that DELPHI gets.
The radiation trace plot is shown on the XNDE25 terminal (lower left trace plot) as radiation.
This number has to be checked on the trace plot, and the dose integrated over the last 24 h can also be checked on the radiation display: VD integrated Doses, which can be found in the shift leader MENU. The dose has to be less than 10 rad per 24 hours, which is actually a maximum.


Call LEP if you see big radiation spikes (above 20 rad) on the trace plot! Ask for a beam dump if the integrated dose exceeds 5-10 rad within a few minutes!!!
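
The radiation rules above can be summarised as a small decision helper. The sketch is illustrative only: the function name is hypothetical, and since the guide gives "5-10 rad within a few minutes" as a range, using the lower edge (5 rad) as the beam dump limit is a conservative assumption.

```python
# Limits quoted in this guide; the choice of 5 rad (lower edge of "5-10 rad")
# and the function name are assumptions of this sketch.
SPIKE_CALL_LEP_RAD = 20.0
FAST_DOSE_DUMP_RAD = 5.0
DAILY_DOSE_MAX_RAD = 10.0


def radiation_actions(spike_rad, dose_last_minutes_rad, dose_24h_rad):
    """Return the list of actions suggested by the three radiation readings."""
    actions = []
    if spike_rad > SPIKE_CALL_LEP_RAD:
        actions.append("call LEP")
    if dose_last_minutes_rad > FAST_DOSE_DUMP_RAD:
        actions.append("ask for a beam dump")
    if dose_24h_rad >= DAILY_DOSE_MAX_RAD:
        actions.append("24 h dose at or above the 10 rad maximum")
    return actions


print(radiation_actions(2.0, 0.5, 3.0))   # []
print(radiation_actions(25.0, 6.0, 4.0))  # ['call LEP', 'ask for a beam dump']
```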

Handling of background by Big Brother

There are 3 different instantaneous background states. They depend on the following value, which is the maximum of BKG1 and BKG3: MAXBKG = MAX(BKG1,BKG3) (since July 1999).

Background LOW when MAXBKG < 3.5

Background MEDIUM when 3.5< MAXBKG < 7

Background HIGH when MAXBKG> 7
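
The three instantaneous states can be written as a short function. The thresholds are the ones listed above; the function name is hypothetical, and because the guide does not say which state applies at exactly 3.5 or exactly 7, assigning those boundary values to the higher state is an assumption.

```python
def background_state(bkg1, bkg3):
    """Instantaneous background state from MAXBKG = max(BKG1, BKG3).

    Boundary values (exactly 3.5 or exactly 7) are not specified in the
    guide; this sketch puts them in the higher state.
    """
    maxbkg = max(bkg1, bkg3)
    if maxbkg < 3.5:
        return "LOW"
    if maxbkg < 7:
        return "MEDIUM"
    return "HIGH"


print(background_state(1.0, 1.2))  # LOW
print(background_state(4.0, 2.0))  # MEDIUM
print(background_state(2.0, 9.0))  # HIGH
```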

According to these 3 states, an algorithm calculates an integral of the background state, INTEGRAL = Period_of_time * MAXBKG. The integrated background state can be either SHORT or LONG (called respectively BackgroundShort or BackgroundLong), depending on the integration time. The integral is reset to 0 as soon as the background goes from HIGH to MEDIUM.


Since the end of 1998, the two integration times are close to each other: 580 seconds for the state SHORT and 600 seconds for the state LONG, simply because we realized that pausing the run was not really necessary. According to the TPC people we don't get any space charge up to a MAXBKG of around 10!


BackgroundShort LOW = Normal conditions

BackgroundShort MEDIUM = (MAXBKG is Background MEDIUM during 580 seconds) OR (during 120 seconds one gets INTEGRAL > 4060 )

BackgroundShort HIGH = (MAXBKG is Background HIGH during 580 seconds) OR ( during 120 seconds one gets INTEGRAL > 8120 )

Big Brother pauses the RUN. As soon as the background goes below 7 for at least 100 seconds (MEDIUM or LOW state), BB re-starts the run.


BackgroundLong LOW = Normal conditions

BackgroundLong MEDIUM = (MAXBKG is Background MEDIUM during 600 seconds) OR ( during 120 seconds one gets INTEGRAL > 4200 )

BackgroundLong HIGH = (MAXBKG is Background HIGH during 600 seconds) OR ( during 120 seconds one gets INTEGRAL > 8400 )

Big Brother sends a RESPOND to BACKGROUND, lowering the voltages to the standby value (i.e. 1150 V for the TPC instead of 1435 V) and waiting for the background to go below 7 (MEDIUM or LOW state) for at least 120 seconds.


The status of these three backgrounds is displayed in the LEP status section of the central SC display. One can look at the INTEGRAL used to manage the SHORT/LONG states by typing DUI BCKG on a cluster machine. Once the background conditions have returned to normal following a "Respond to Background" command, Big Brother will issue a "Prepare for Run" command, having first asked for confirmation.
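
To make the SHORT/LONG conditions more concrete, here is a minimal sketch of how they could be evaluated. It is not the Big Brother code: the function name, the one-sample-per-second input, the treatment of boundary values and the reading of "MEDIUM during N seconds" as "at least MEDIUM for N seconds" are all assumptions. Note that the quoted integral limits equal the integration time multiplied by 7 and by 14 (580 x 7 = 4060, 580 x 14 = 8120, 600 x 7 = 4200, 600 x 14 = 8400).

```python
# Numbers quoted in this guide; the evaluation logic is an illustrative guess.
SHORT = {"duration_s": 580, "medium_integral": 4060, "high_integral": 8120}
LONG = {"duration_s": 600, "medium_integral": 4200, "high_integral": 8400}
WINDOW_S = 120  # integration window for the INTEGRAL condition


def integrated_state(samples, cfg):
    """samples: MAXBKG values, one per second, most recent last (assumption)."""
    n = cfg["duration_s"]
    recent = samples[-n:]
    full = len(recent) == n                 # enough history for the duration test
    integral = sum(samples[-WINDOW_S:])     # 1 s * MAXBKG summed over 120 s
    if (full and all(v >= 7 for v in recent)) or integral > cfg["high_integral"]:
        return "HIGH"
    if (full and all(v >= 3.5 for v in recent)) or integral > cfg["medium_integral"]:
        return "MEDIUM"
    return "LOW"


steady = [8.0] * 580                    # MAXBKG = 8 for 580 seconds
print(integrated_state(steady, SHORT))  # HIGH
print(integrated_state(steady, LONG))   # LOW, the 600 s are not yet reached
```

With these numbers the SHORT state turns HIGH 20 seconds before the LONG one, which matches the remark above that the two integration times are now close to each other.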





What to do in case of repetitive DAS crashes ?


Statistics are only useful if the essential parts of DELPHI are fully functioning. If, however, severe problems in a single detector are inhibiting data taking (i.e. DAS crashing too often, every 3 minutes, or DAS stuck), it may be sensible to remove it from DAS (or from SC if a severe SC problem occurs). Here is the procedure to follow in order to be able to continue to take data if DAS is stuck or crashing a lot.

What is it possible to remove from DAS ?


IF DAS IS UNABLE to run with a DAS partition which is crashing often enough to substantially reduce the overall data taking efficiency, the shift leader is allowed to take decisions in order to be able to continue to take data. This decision is really detector dependent, as shown in the following table. The DAS maestro has to follow the DAS instructions to remove any CP: DAS instructions
These instructions are of course the same in case of Slow Control problems: the shift leader has to take the decision to continue data taking without a detector in case of serious Slow Control problems which stop DELPHI for more than 15 minutes!

In the following table we present the list of detectors with the possible action to be taken. First, the shifter has to follow the diagnostic procedure described in the DAS Hands On Guide in the chapter "IN CASE OF ERROR". With this procedure he must be able to find out whether the problem comes from 1 LES or 1 CP.

If not successful after the intervention of the DAS on call, proceed according to the following table:

BLUE detectors : Follow the procedure to remove the faulty CP
GREEN detectors : Follow the proposed procedure and CALL the detector DAS expert
RED detectors : Call the detector DAS expert and the run and data coordinators.

If some detector is removed from GLOBAL SC or is in a "not ready" state, the shift leader has to take care of the Run related detectors, which are necessary for Big Brother in order to start a run. The Run related detectors are the following: ID, TPC, OD, HPC, STC, TP. Don't forget that if there is any problem in the SC state of one of these detectors (Fastbus power supply off, etc.), then Big Brother will refuse to start the run. (Take care: HPC is in Run Related only at the beginning of the fill.) You will first have to remove the detector from Run related (small green square to mask in the SC display).
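
The rule above can be pictured as a simple check over the list of Run related detectors. This sketch is purely illustrative: the function name and the data structures (a dictionary of SC readiness flags, a collection of masked detectors) are hypothetical and do not describe the actual Big Brother implementation.

```python
# Run related detectors as listed in this guide.
RUN_RELATED = ("ID", "TPC", "OD", "HPC", "STC", "TP")


def detectors_blocking_run(sc_ready, masked=()):
    """Return the Run related detectors that would stop Big Brother.

    sc_ready: dict mapping detector name to True when its SC state is ready
              (a missing entry is treated as not ready).
    masked:   detectors removed from Run related in the SC display.
    """
    return [d for d in RUN_RELATED
            if d not in masked and not sc_ready.get(d, False)]


state = {"ID": True, "TPC": True, "OD": False, "HPC": True, "STC": True, "TP": True}
print(detectors_blocking_run(state))                 # ['OD']
print(detectors_blocking_run(state, masked=["OD"]))  # []
```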

Table : List of detectors and action to be taken
DAS Partition | Detector | Number of CP/LES | Possible action allowed
TP | Trigger partition | 1 CP + 1 LES | Forbidden to remove !!! CALL DAS/TRIGGER EXPERTS
VD | VD + PIXELS | 1 CP PIX + 1 CP Barrel | Forbidden to remove VD-CP (barrel) but allowed to remove VD-CPIX (Pixels)! Allowed to remove VD-CP only if DAS is completely stuck!
ID | ID Jet + Straws | 3 CP + 1 LES | Don't remove any CP from the configuration! Call the expert and solve the problem ASAP, call the run coordinator!
TPC0 | TPC | 12 CP + 1 TRIG0-CP + 1 LES | Possible to remove any CP (TPC-CPxx), but only 1 at a time. DON'T remove a trigger CP (called TPC-TRIG0). Forbidden to remove a LES or more than one CP. Call the expert ASAP anyway!
TPC1 | TPC | 12 CP + 1 TRIG1-CP + 1 LES | Possible to remove any CP (TPC-CPxx), but only 1 at a time. DON'T remove a trigger CP (called TPC-TRIG1). Forbidden to remove a LES or more than one CP. Call the expert ASAP anyway!
RIB0 | Barrel RICH | 1 LES + 4 CP | Possible to remove any of the CP! If there are severe problems, possible to remove the full partition when the problem cannot be localised in one single CP.
RIB1 | Barrel RICH | 1 LES + 4 CP | Possible to remove any of the CP! If there are severe problems, possible to remove the full partition when the problem cannot be localised in one single CP.
OD | Outer detector | 1 LES + 1 CP | Possible to remove the full partition, call the expert ASAP!
HPC0 | 1 half HPC | 1 LES + 2 CP | Not possible to remove any CP (1 quarter of the HPC lost!). Call the expert.
HPC1 | 1 half HPC | 1 LES + 2 CP | Not possible to remove any CP (1 quarter of the HPC lost!). Call the expert.
TOF | TOF | 1 LES | Possible to remove the full partition. Call the expert.
HAC | HAC | 1 LES | Don't remove the partition! Call the expert and solve the problem!
VSAT | VSAT | 1 CP | Possible to remove the partition, call the expert ASAP!
STC | STC | 1 LES | Don't remove from the acquisition, call the expert + Patrick
FCH | FCA + FCB | 1 CP (FCA) + 2 CP (FCB) + 1 LES | Don't remove from the acquisition, call the expert! If DELPHI is stuck, possible to remove only the faulty CP (FCA or FCB)
RIF0 | RIF | 1 LES | Possible to remove the full partition
RIF1 | RIF | 1 LES | Possible to remove the full partition
EMF | FEMC | 1 LES + 2 CP | Remove the full partition in order to be able to debug in local, call the expert ASAP!
MUON | Forward Muon + Barrel Muon | 1 MUON-LES + 2 CP (forward) + 2 CP (barrel) | Don't remove any CP of the MUON, call the expert ASAP! If DELPHI is stuck, remove the faulty CP (MUF or MUB)
MUS | MUS | 1 CP + 1 LES | Possible to remove the full partition, call the expert!




What to do after removing CP/Partition from DAS ?


When you remove a CP or a partition from DAS, you have to call the detector expert and tell him what is happening. The rule is to leave the detector out for the rest of the fill, but the expert has to come to see whether there is an obvious way to cure the problem (e.g. a hardware fault detected and a spare module ready).


Then, there are two cases:


The removed detector/partition can be tested in local
The detector/partition which has been removed cannot run in local when the expert arrives, but after replacement of hardware or a significant action by the expert it is able to run in local. In this case, you are allowed to try once (only once) to put it back in Global. If it works: FINE! If it doesn't work, you have to ask the expert to wait for the end of the fill to try again!!!

The removed detector/partition cannot be tested in local (example: a TPC CP which has been removed!)
In this case the expert has to come (even during the night) to try to diagnose the problem. He has to know that the CP/Partition will not be put back in GLOBAL before the end of the fill unless there is a very clear, obvious problem which has been demonstrated by the expert (example: power supply replaced or digitiser replaced).
As a reminder, here are the instructions for the DAS maestro, which are more precise and allow him to react properly: DAS instructions

Cluster monitoring

You have to monitor what is going on with the cluster, especially if you feel that things are going slowly on the terminals, or if a detector is crashing DAS and you cannot restart: it can be due to an AXP problem. Have a look at the AXDEMG station and follow the following instructions in order to identify the problem. Call the system manager on call (day/night) if you feel that you are not able to solve the problem. Cluster monitoring

What to do in case of power cut ?

In case of power cut the shift leader has to help the SC maestro to follow the instructions. If the shift leader is the SC maestro, he has to be helped by the 2 other shifters to call people. If the shift leader is the DAS maestro, he has to help the SC maestro to go through the contact list for incident recovery: contact list
The origin of the power cut can vary (technician mistake, EDF general cut, GSS alarm, etc.), but the shift leader has to understand the origin of the power cut before switching ON the FASTBUS part. We already saw in the past 3 successive power cuts caused by a stupid technician trying to understand some electrical circuits downstairs... In that case the Fastbus part was switched on/off 3 times within 1 hour and a lot of damage was done that day. So it is better to really understand the origin of the cut before switching on all the Fastbus part. The shift leader also has to worry especially about the magnet: the piquet has to be called quickly! Check also the HELIUM in the following LEP page: Helium level. On this page you can find the status of the helium level, which has to be stable (70% inside the Dewar and 80% inside the quadrupoles and cryostat of the magnet). The history line allows you to trace the stability of the level over the last few hours!

To know more on Big Brother ?

All the commands involving the switching on of equipment (Prepare for Coarse Tuning and Prepare for Run), when issued by Big Brother, need a confirmation from the SC Maestro. At the moment this confirmation has to be given from the DAS Maestro's display. Make sure he/she is aware of any reasons why the confirmation should not be given.



Big Brother Control

Big Brother is a high-level control that integrates DAS, Slow Controls and LEP. For example, the DAS data taking is stopped when certain slow controls problems occur, and the voltages are ramped down when LEP dumps the beams.

Big Brother - Normal Operation (as in SC guide)

Here is the normal procedure used by BIG BROTHER to start a fill: BIG BROTHER actions:
The following figures show the 2 procedures: the top one was used before 26/06/2000, and the bottom one presents the current setting:
BIG BROTHER will ask you to CONFIRM at ADJUSTING to go to STAND BY (TPC=450 V), and will ask for a second CONFIRM when LEP closes the separators (set to 0) before closing the collimators. Then it takes 2 minutes to be ready in time for collisions.
Big Brother : START a FILL

To know more on LEP , news , plots , statistics


The following link gives more news on LEP. The LEP coordinator page contains the minutes of the LEP schedule meetings and a lot of plots on data taking from the experiments:
LEP coordinator page

To know more on LEP RF cavities : What is available ?

If you want to know how the RF is working at the moment, you have to look at the WEB page: LEP page 151
On this page you will find 20 lines corresponding to the RF units of the 4 interaction points. There are 6 units at each of points 4 and 8, corresponding to different cavity numbers (only SC cavities):
1) Units 3 (left side of IP4) = 431,432,433 at point 4
2) Units 3 (left side of IP8) = 831,832,833 at point 8
3) Units 7 (right side of IP4) = 471,472,473 at point 4
4) Units 7 (right side of IP8) = 871,872,873 at point 8
And 4 units at each of points 2 and 6, corresponding to different cavity numbers (SC and copper):
5) Units 3 (left side of IP2) = 232 and 233
6) Units 3 (left side of IP6) = 632 and 633
7) Units 7 (right side of IP2) = 272 and 273
8) Units 7 (right side of IP6) = 672 and 673
On each unit, you get in general 2 klystrons, and you get 4 cavities (SC) per klystron. Have a look at page 151: there you will find 20 lines for the 20 SC units (6+6 (IP4/8) + 4+4 (IP2/6)). On each line you get 1 or 2 klystrons depending on the unit, called K1 and K2 ... so in total 36 klystrons ...
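
The counting above can be checked with a few lines of arithmetic. The unit numbers are the ones listed in this section; the variable names are only illustrative. Taking the quoted totals (20 units, 36 klystrons, 1 or 2 klystrons per unit) at face value, the split between units with 2 klystrons and units with 1 klystron follows directly.

```python
# RF unit numbers per LEP point, as listed in this guide.
UNITS = {
    4: [431, 432, 433, 471, 472, 473],
    8: [831, 832, 833, 871, 872, 873],
    2: [232, 233, 272, 273],
    6: [632, 633, 672, 673],
}
TOTAL_KLYSTRONS = 36  # total quoted in the guide

total_units = sum(len(units) for units in UNITS.values())

# With x units carrying 2 klystrons and y units carrying 1:
#   x + y = total_units  and  2x + y = TOTAL_KLYSTRONS
units_with_two = TOTAL_KLYSTRONS - total_units
units_with_one = total_units - units_with_two

print(total_units)     # 20 lines on page 151
print(units_with_two)  # 16
print(units_with_one)  # 4
```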
As an example, here are the 2 following pictures:
Page151 : Example


In the columns:
Heatr: you see whether the klystron(s) are on/off
HV: you get the voltages used by the klystron(s) *2 KV each
K1 or K2: the power in kW used by klystron K1 (K2)
MV1 or MV2: the voltage in MV from the cavities (K1 or K2)
The total is written at the bottom: on the left for all cavities (SC + copper) (556 MV), on the right for SC only (here 528.1 MV). On page 152, you will find the total for the copper cavities only: Page152 : Example

You will find the real pages 151 and 152 at the following address to monitor what LEP is doing at the moment: Page 151 (SC cavities) Page 152 (copper cavities)

You have to know that the nominal value of the RF is 3650 MV (3.65 GV), which easily gives 103.5 GeV per beam with a margin of 2 klystrons. The highest energy reached so far (20/6/2000) is 104.5 GeV, without any klystron in reserve (3650 MV), but in this case the average duration at that energy is of the order of 10 minutes (LEP is trying to increase this average by playing with the RF voltages, i.e. lowering the voltage of the RF units which trip often ...)


Last update on June 27, 2000 by Patrick Jarry