Risk Assessment for CEM Splitter Installation of the EMTiming System during the summer 2003 Shutdown

 

Dave Toback

 for the EMTiming group

(7/9/03)

 

The EMTiming system, to instrument the EM calorimeter for timing readout, is a CDF, DOE and Italy approved Run IIb project with all of its funding secured. Pre-production prototypes are currently working on the detector, and the full production components of the system are on schedule to be completed and tested by the summer 2003 shutdown. The EMTiming group, with the endorsement of the Exotics, Photon and Calorimeter groups, is proposing to install the entire system during that shutdown. The physics justification, as well as the test-bench and on-detector test results for the project, can be found at:

http://hepr8.physics.tamu.edu/hep/emtiming/

 

In this note we present a risk assessment to address concerns about the installation of the CEM splitter system[1]. A full installation and checkout procedure, based in part on this assessment and designed to minimize and mitigate the risks, has been reviewed and approved by the calorimeter SPLs and Dervin and can be found on the above web page. We believe the installation should proceed contingent on this procedure.

 

We begin with our understanding of the current CDF/FNAL mandate and the risks. We then detail our risk assessment and how it compares to the current risk tolerance levels. In the last section we summarize our responses to the questions of why we should install the CEM now as opposed to waiting until the Run IIb shutdown or just doing the PEM now.

 

CDF/FNAL Mandate

 

The current mandate from the collaboration and lab is effectively:

 

a)     We should not modify/change a working detector unless there is a compelling reason (physics justification) to do so or it is absolutely necessary. The current goal is for the whole system to work and for us to have as large a live time and efficiency as possible.

b)     For the summer shutdown (as with any shutdown) the goal is to minimize the possibility that we come out of the access and need to go back in to fix anything.

Unless there is a push from the physics groups, the risk should be minimized and it is inappropriate to attempt any new installations. However, there is a strong push from the Exotics and Photon groups, with widespread agreement that the physics case is strong and compelling. The leaders of these groups feel that, contingent on a satisfactory procedure, the physics priorities dictate that we install as quickly as is safely possible.

 

The different types of risk:

 

       I.     The splitter system does not function as expected on the real detector and needs to be removed.

     II.     Damage is done to the detector during installation, and the breakage is either time-consuming to fix or irreparable.

 

We address these individually:

 

I) Splitter does not function as expected: The splitter functionality has been studied in great detail. All evidence shows that the splitter works as advertised both on the test bench and on the detector itself (on CEM West 0 and West 23, Tower 9) during data taking. To further mitigate any risk, each splitter is connectorized using LEMO connectors so that if there is any problem we can simply unplug the system and put it back the way it was originally. For the two most easily accessible wedges (say 17 and 18), it will take longer to get downstairs than it will to unplug them. For the entire system, it would probably take a couple of days to unplug everything. There is a risk of a problem that is not discovered right away. In this case, we lose the data from those wedges or a lot of work is required to salvage the data. However, studies of 2 months of data taking have shown no such problems on the channels that are currently instrumented.

 

II) Damage is done to the detector during installation: There is widespread consensus that during installation the probability that something will get broken, as well as its severity, is most affected by the training and carefulness of the people doing the work. Looking back at the last cable installation, it was felt that the people doing the installation were not fully aware of how easy it was to damage the CHA laser fiber system, and there were a large number of channels that needed to be fixed afterwards. To minimize this risk we have a detailed installation and checkout procedure, and it is agreed that only people approved by Dervin and trained on the test wedge on the 1st floor will do the installation, with TAMU students and post-docs only being gophers and participating in the system checkout. To maximize the probability of finding any problems and getting them fixed before the end of the shutdown, we propose that the splitter harness installation begin as soon as possible after the beginning of the shutdown, and that we do a checkout after every half-wedge (check, fix and re-evaluate as we go). However, to be clear, even if we start at the beginning, the installation needs to go at a pace that is dictated by the safety of the detector, not the total amount of installed hardware. For this reason, the system is designed to be very modular, so partial installations are possible and still useful. To further reduce the risk we have designed the splitters so that they are unlikely to catch on other wires/fibers during installation, and we will run the splitter lines along the metal plate (again see the Installation Procedure document). We continue with more details of the installation itself and of the components that might get damaged during the process, as well as the severity, time to fix and mitigating factors, if any.

 

Installation Tasks:

 

Since the primary concern is the installation risk, it helps to begin with a description of the installation of a CEM splitter harness and list the components that could be affected (again see http://hepr8.physics.tamu.edu/hep/emtiming/ for more details). The harness effectively has three components: 1) the piece which is used to connect to the PMT base, 2) the cable which runs from each PMT to the edge of the CEM wedge (the cable for each PMT location is cut to length), 3) the bundled harness which goes from the edge of the wedge along the cable trays to the readout racks. The installation is accomplished in a number of steps. We begin by laying the cables along the wedge itself by unrolling the cable bundle from the edge of the wedge and carefully running the individual lines along the metal plate on the wedge to their respective PMT bases. When each is in place, we attach them to the bases. We do this for both sides of the wedge (with checkout in between) and fasten with cable ties attached to the metal plate. We then bundle the two sides together and run them together along the cable tray. The harnesses from three wedges are joined at the end and run together to the relay racks. The harnesses themselves would be tied down on top of what is there or tied to the cable tray itself. Finally, the cables are dressed at the relay rack.

 

Dervin believes that a 2-person crew can install about 3 wedges/day, with separate additional time to run the cables and dress them at the relay racks. For the top wedges, where you need Monkey bars, you can only have 1 person doing work at a time. When you need a Genie, you have to come down before you can move the Genie, and you will probably need to move it 2-3 times per wedge. In these cases, as well as the ones on the bottom, we estimate 1 day/wedge, assuming we can still get to them. Installing on the arches that don’t need Genies will mitigate much of the time constraint, assuming we have access to them.
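The rates quoted above can be combined into a rough crew-day estimate. The following is a minimal sketch in Python and is not part of the installation procedure. The function name crew_days and the split between easy and hard-to-reach wedges are hypothetical and chosen only for illustration; the rates (about 3 wedges/day, and 1 day/wedge where Monkey bars or a Genie are needed) are the ones quoted above, and the total of 48 wedges is the number used in the fiber-breakage estimate later in this note. The time to run the cables and dress them at the relay racks is not included.

```python
TOTAL_WEDGES = 48        # wedge count used in the fiber-breakage estimate in this note
EASY_WEDGES_PER_DAY = 3  # quoted rate for a 2-person crew
HARD_DAYS_PER_WEDGE = 1  # quoted estimate for top (Monkey bars/Genie) and bottom wedges


def crew_days(n_easy, n_hard):
    """Working days for a single crew to install n_easy + n_hard wedges.

    Excludes the separate time needed to run cables to the relay racks
    and dress them there.
    """
    return n_easy / EASY_WEDGES_PER_DAY + n_hard * HARD_DAYS_PER_WEDGE


if __name__ == "__main__":
    # HYPOTHETICAL splits: the note does not say how many wedges are hard to reach.
    for n_hard in (0, 12, 24):
        n_easy = TOTAL_WEDGES - n_hard
        print(f"{n_hard:2d} hard wedges -> {crew_days(n_easy, n_hard):.1f} crew-days")
```

Because the estimate grows quickly with the number of hard-to-reach wedges, the sketch mainly illustrates why starting early in the shutdown and keeping the installation modular both matter.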

 

 

CEM components that could be damaged and notes on the systems of most concern:

 

The parts on the CEM that could be affected are[2]:

 

-        Fibers: Quartz fibers for Hadron laser system, quartz fibers for the CEM Xenon flasher system, and plastic fibers for the CEM LED system.

-        Phototubes and bases

-        HV distribution box (Pisa Box)

-        Shower max crate

-        Power supplies for shower max (rabbit supplies)

-        Cabling: Shower max, preshower, crack chambers, CEM and had calorimeters readout.

-        Controller box for the source runs.

 

The dominant concern is the fibers for the CEM and CHA. The LED, Xenon and Laser systems are each separate, and each has a different set of problems. We note that each of these systems is primarily used for monitoring and debugging. Although these calibrations are used only indirectly in the final calibrations, losing these fibers would mean that we could not calibrate without data. However, except in emergencies, final calibrations are done with data.

 

1.     CHA Laser fibers: These are quartz fibers, so they are very thin, fragile, and largely un-fixable if broken because there are only a few spares left after the massive fix job from the Run IIa installation. Unfortunately, the fibers are exposed near the face of the phototube and are vulnerable along their run from the phototube to the “angle bracket.” Many of these were broken during the main cable installation, mostly because (according to what we’ve heard) people didn’t know they were a problem and cables were run along the PMT bases. Currently, there are none broken in the CHA (about 4 broken in the WHA).  To mitigate this risk, our installation differs from the Run IIa installation in that we are running the splitters along the metal plate on the base side of the PMT, away from all the fibers. Even so, Dervin conservatively estimates a 20%-30% chance of breaking a fiber per wedge during installation, which implies about 0.3*48≈15 more broken fibers by the time we are done. Fotis prefers not to estimate, but rather stresses that the number of fibers broken is most dependent on the carefulness of the installers. Since this is, to many, the dominant risk/concern, we paraphrase Fotis’ view as the maintainer of the laser system.

 

He believes (as previously stressed) that we should go ahead with plans to install the system, however before doing so there needs to be real preparation. Specifically, the technicians doing the installation need to be trained as to where the fibers are and how to avoid breaking them (he will help). Also, during installation they need to do the job slowly and carefully and check after each ½ wedge to make sure that their installation technique isn’t breaking anything. If after the first ½ wedge we see breakage, either we fix the installation technique or we stop the installation completely.

 

This view forms the basis of our installation philosophy and we note that problems with the CHA laser system are easily identified using the laser fiber calibrations.

 

2.     CEM LED fibers: These are fairly robust as they are plastic fibers, but are the most likely to get bumped as they are the most exposed in their run to the face of the CEM PMTs. In particular, the fiber for PMT-6 sits on a corner of the LED box. There are currently about 26 of these broken, half of which are PMT-6, and many of which were left over from Run I. There is a plan to fix some, if not all, of them this summer. Bob Wagner estimates about 1/2 hr per fix. He thinks that we can make a box (cardboard?) that covers the LED fiber box and would help protect the fibers during installation, although Dervin worries about doing damage while putting on the box. We note that a further mitigating factor is that, as in the CHA fiber case, none of the splitters will be run near the PMT faces (where the fibers go to the PMT), nor will any splitters run past tower 9 where the LED box is located. Furthermore, the LED calibrations will quickly identify any problems so they can be quickly fixed. Given these mitigating factors, as well as how straightforward and quick these fibers are to fix, Bob Wagner considers this easily an acceptable risk.

 

3.     CEM Xenon flasher fibers: While these are very thin, fragile quartz fibers and (as far as we know) un-fixable if broken, they are well protected and hard to break. The fibers for each wedge are located in the middle of the wedge, coiled up and placed on top of the steel plate, but below an aluminum covering. With this protection only about 4 inches are exposed, so none were broken during the last installation, and there are none currently broken. Furthermore, our splitter lines are not anywhere near them. Any problems would be quickly picked up by the Xenon calibrations, and Steve Hahn considers this a minimal risk and a non-issue.

 

4.     Shower max cables:  These are sensitive cables and if bumped sufficiently will need to be reseated. Dervin estimates there is a 20%-30% chance per wedge that we will bump these cables at this level. The Shower max calibrations are set up to check for this, so it is straightforward to find these problems and, while a pain, each is fixed in about 1/2 hour. The cables from the wedge to the Shower max crate are less vulnerable, but the problems are harder to detect. To ensure we find problems quickly we will run cosmics overnight and analyze the data, as was done during the Run IIa commissioning.

 

5.     Readout crates and the path to the crate (at 2 and 4 o'clock): While the probability of damaging something during installation of the cables from the edge of the wedge to the rack is small, some notes are in order. The space is really tight as there are trigger, calorimeter and muon cables. Rob and Dervin are not worried so much about breaking stuff; Rob is mostly concerned about loosening the connections or breaking a marginal cable. Calibrations can be used to check that nothing is unplugged or broken. All cables from a wedge are bundled together and run along the cable tray. While Rob thinks we will need to undo the bundles and put ours in, Dervin thinks we can tie ours down on top of what is there and/or tie them to the cable tray. While we need to dress the cables at the relay rack, Dervin is not concerned about disturbing anything there. (To quote Dervin: Getting them from the wedges to the racks is the easy part. Feeding from the focal point at the wedge to the PMT's is the hard part and the risky part).

 

6.     Other problems: On the rest of the wedge there is the possibility of breaking a base, stepping on a cable or damaging one of the other items listed above. These are considered low probability items, and are all fixable, with the time required being dependent on the type of problem.

 

Bottom line for a Dervin trained/supervised crew doing the work: there is a 99% chance that we will bump things. The chance of damaging something that would need to be repaired is about 30%. The chance of damaging a fiber beyond repair is about 20%-30%. The chance of damaging something else beyond repair, or in a way that would take a long time to repair, is very low, ~5%.  There is a debate in the calorimeter group as to whether these numbers would be smaller if the installation were done with the detector out of the hall. Certainly everyone agrees it would be easier since there is better access and bigger lifts are available. However, with many people having easy access to the calorimeter, the breakage potential may in fact be higher.
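The per-wedge estimates quoted above for the CHA laser fibers and the Shower max cables can be turned into totals with simple arithmetic. The following is a minimal sketch in Python, given only as an illustration. It assumes that all 48 wedges are instrumented and that each wedge is independent of the others with the same per-wedge probability; neither assumption is a statement from the calorimeter experts, and the function names are hypothetical. The 20%-30% per-wedge rates and the roughly 1/2 hour per Shower max reseat are the numbers quoted in this note.

```python
N_WEDGES = 48            # wedge count used in the note's 0.3*48 estimate
RESEAT_HOURS = 0.5       # quoted time to reseat one bumped Shower max cable


def expected_count(p_per_wedge, n_wedges=N_WEDGES):
    """Expected number of wedges with a problem, given a per-wedge probability."""
    return p_per_wedge * n_wedges


def prob_at_least_one(p_per_wedge, n_wedges=N_WEDGES):
    """Probability of at least one problem, ASSUMING wedges are independent."""
    return 1.0 - (1.0 - p_per_wedge) ** n_wedges


if __name__ == "__main__":
    for p in (0.20, 0.30):  # quoted per-wedge range for fibers and Shower max cables
        n_expected = expected_count(p)
        print(f"p = {p:.0%} per wedge:")
        print(f"  expected broken CHA fibers        : {n_expected:.1f}")
        print(f"  expected Shower max reseats       : {n_expected:.1f}")
        print(f"  expected Shower max repair hours  : {n_expected * RESEAT_HOURS:.1f}")
        print(f"  P(at least one fiber broken)      : {prob_at_least_one(p):.5f}")
```

Under these assumptions the expected number of broken CHA fibers ranges from 9.6 to 14.4, and the expected Shower max repair time from 4.8 to 7.2 hours. With any per-wedge rate in this range, at least one breakage over a full installation is close to certain, which is why the checkout after each half-wedge carries so much weight.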

 

Questions, Alternatives and Summary:

 

1) Why install the CEM now? The physics should always drive the detector priorities, and the dominant sensitivity comes from the CEM with as much data as possible. We already have installed and working versions of the CEM all the way from the PMT to the TDC crate, and all the production components are on schedule to be ready for the summer; we are well prepared for the full system. To optimize the physics capabilities we should install the components with the best sensitivity as early as possible. The alternative, not installing this summer, would mean that we lose all the data until the next (as yet unscheduled) shutdown, and there is no compelling reason to believe it would change the installation risk (unless we install during the Run IIb shutdown when the detector is out, which is described below). Meanwhile the equipment would simply sit and we may lose the opportunity of having Italian techs to help with the installation. Furthermore, since the amount of data taken up until now has been smaller than anticipated, installing now would minimize the final fraction of non-EMTiming data, which is a huge advantage over having the data split into two large and potentially separate datasets: with EMTiming and without EMTiming. Such a split should be avoided if at all possible.  The bottom line is that there is broad consensus that the system should be installed at some point and the decision before us is when. The Exotics group and we believe the answer is as soon as possible.

 

2) Why not wait until the detector rolls out and install then, since it’s less risky? Wasn’t this a Run IIb project in the first place?  It is true that this was originally a Run IIb project. However, it is ready now and the compelling physics case for upgrading the detector sensitivity provides a great opportunity. It is also true that installation would be easier and, arguably, less risky during a shutdown where the detector is rolled out. However, the detector is not scheduled to come out again until the Run IIb shutdown (2006?) and recent indicators give us reason to doubt that the detector will ever roll out and back in again. We have the ability to install it now, so instead of having the system just sit for 3 years we could, using Run IIa data, perform a number of important searches and potentially make discoveries. Also, the Run IIa and IIb EM calorimeter data, for such things as the W-mass measurement, would be much more easily combined.

 

3) Since the PEM is safer, why not just install the PEM and long cables this shutdown and do the CEM during the next shutdown? Wouldn’t doing the PEM first ensure that it gets properly maintained and provide a good warm-up? There is no argument that it would be safer to install less, and to install the less risky part. However, physics arguments push for the CEM installation and there needs to be a balance/weighing of the two considerations. The issue is whether the installation procedure is satisfactory and whether the human resources and scheduling allow for the work to begin. We already have working versions of the CEM and PEM cabling installed on the detector, so we are well prepared for the full system. An installation and checkout procedure is in place. The readout and monitoring software for both systems is in place and functioning well. There is no compelling reason to believe that the PEM would get ignored if we did both CEM and PEM. As with the HAD system, the searches we want to do would benefit from direct photon timing, but there are also large additional advantages to reducing the MET-measurement pathologies with a global EM energy understanding at all eta in the calorimeter (and beam halo can affect all eta). While it is true that having less to do should in principle reduce the risk, as mentioned above, if the installation is done at a pace set by detector safety as the top priority and/or with enough people to finish it all, it shouldn’t matter whether the plug is being installed. Again, even if we don’t get all of the CEM done, any installed CEM components are much more directly sensitive for physics than the PEM.



[1] There is currently no objection to the installation of the PEM.

[2] Pictures of all of these components as well as their locations within the wedge can be found in the installation document on our web site.