LIMIT: A Cross Section Limit Calculator

This document explains the use of the program limit, which calculates cross section limits according to the prescription in DØ Note 2775a. A simpler, web-based program is available at http://www-d0.fnal.gov/~hobbs/limit_calc.html and is suitable for many problems. The advantage of limit over the web-based calculator is that limit can include the effects of known correlations (between background expectations and signal efficiencies, for example), while the web-based program cannot. The main disadvantages of limit are that it is significantly slower and requires more detailed input.

One of the warnings for the web-based calculator also applies to limit: one of the approximations recommended in DØ Note 2775a becomes inaccurate when the fractional error in the signal efficiency is more than about 25%. In that case the calculated cross section upper limit will be larger (i.e. the limit is less restrictive) than the one a more exact calculation would give.
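The 25% warning can be checked before running the program. The following is a minimal sketch, not part of limit itself. It assumes that the energy-scale systematic is half the spread between the LOW and HIGH efficiencies, and that it adds in quadrature with the statistical error. That combination is an assumption made here for illustration and is not taken from DØ Note 2775a. The function names are hypothetical.

```python
import math


def efficiency_fractional_error(nominal, stat_err, low_escale, high_escale):
    """Return the total fractional error on the signal efficiency.

    Assumption (not from the limit documentation): the energy-scale
    systematic is half the spread between the LOW and HIGH efficiencies,
    and it is added in quadrature to the statistical error.
    """
    escale_err = abs(high_escale - low_escale) / 2.0
    total_err = math.hypot(stat_err, escale_err)
    return total_err / nominal


def approximation_is_suspect(nominal, stat_err, low_escale, high_escale,
                             threshold=0.25):
    """True when the fractional error exceeds the ~25% level quoted above."""
    frac = efficiency_fractional_error(nominal, stat_err, low_escale, high_escale)
    return frac > threshold


# Made-up efficiencies, for illustration only.
print(approximation_is_suspect(0.10, 0.01, 0.09, 0.11))
print(approximation_is_suspect(0.10, 0.03, 0.09, 0.11))
```

If the check returns True, the limit reported by the program is likely to be more conservative than an exact calculation would give.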

Where to find the program

The program (executable image) is available both on the Alpha cluster and on DØ Challenges.

On the Alpha cluster: The executable image (limit.exe) is in the directory tmp$root302:[paterno.limit_program]. Source code, and a command file limit.lnk to compile and link the program, are in the subdirectory [.source]. The program also uses some libraries found in the directory tmp$root301:[paterno.lib], as noted in limit.lnk.

On the Challenges: The executable image (limit.x) is in the directory /prj_root/723/top_4/cs_limit. A sample input file is also in this directory, as is a script to compile and link the program. The source code is available in the subdirectory /source.

How to run the program

The program input is organized to make multiple limit calculations simpler. From the command line, limit reads the information about the signal efficiency, as well as the parameters that control the range and step size of the integration. The program also requires an input file, which contains the integrated luminosity for the data sample and any number of background expectation values, with their uncertainties.
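To make the roles of the integration range and step size concrete, here is a much simplified sketch of an upper-limit calculation on a grid of cross sections. It is not the algorithm used by limit. It assumes a flat prior in the cross section, a Poisson likelihood, and a fixed background and luminosity times efficiency with no uncertainties or correlations, which limit does treat. All function and parameter names are hypothetical.

```python
import math


def poisson_pmf(n, mu):
    """Poisson probability of observing n events for a mean of mu."""
    if mu <= 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))


def upper_limit(n_obs, background, lum_eff, smax, step, cl=0.95):
    """Scan the cross section from 0 to smax and return (limit, cumulative).

    The posterior density is evaluated on a grid with the given step,
    normalized to unit sum, and accumulated until it reaches cl.
    """
    n_bins = int(round(smax / step))
    sigmas = [i * step for i in range(n_bins)]
    density = [poisson_pmf(n_obs, background + s * lum_eff) for s in sigmas]
    total = sum(density)
    cumulative = 0.0
    for s, d in zip(sigmas, density):
        cumulative += d / total
        if cumulative >= cl:
            return s, cumulative
    return None  # cl was not reached inside the range of integration


# Zero candidates, no background, lum*eff = 1 (made-up inputs).
print(upper_limit(0, 0.0, 1.0, smax=50.0, step=0.01))
```

In this sketch, a range that is too short cuts off part of the posterior and biases the normalization, while a step that is too coarse limits the precision of the result.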

Format of the input file

The input file is organized as a series of lines, each of which begins with either a tag that labels the data on that line, or an exclamation point that indicates a comment. Note that both 'nsub' and 'back' can be used for specifying background expectations. 'nsub' has units of events; 'back' has units of cross section. Of course, one must not include the same background source both ways, or it will be subtracted twice.

! Number of events (candidates) observed
cand n 
! Counted background to subtract (measured in number of events, not
! in units of a cross section). 
! Format is: (value) (statistical error) (systematic error)
nsub x y z
! Integrated luminosity of data sample, statistical and systematic error
lum a b c
! Number of background sources which are to follow
nback m
! Information on background sources.
! The columns are (from left to right)
!     production cross section
!     systematic error in cross section
!     branching fraction to this final state
!     statistical error in branching fraction
!     nominal acceptance
!     statistical error in nominal acceptance
!     acceptance with LOW "energy scale"
!     acceptance with HIGH "energy scale"
back xs xserr bf bferr acc accerr acclow acchigh
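The layout above can be checked with a small reader before a long job is submitted. The following Python sketch is not part of the limit distribution. It assumes whitespace-separated fields and one 'back' line per background source, and the numbers in the sample text are made up for illustration.

```python
def parse_limit_input(text):
    """Parse the tagged lines of a limit input file into a dictionary."""
    data = {"back": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("!"):
            continue  # blank line or comment
        tag, *fields = line.split()
        if tag in ("cand", "nback"):
            data[tag] = int(fields[0])
        elif tag in ("nsub", "lum"):
            # value, statistical error, systematic error
            data[tag] = tuple(float(f) for f in fields[:3])
        elif tag == "back":
            # xs, xserr, bf, bferr, acc, accerr, acclow, acchigh
            data["back"].append(tuple(float(f) for f in fields[:8]))
        else:
            raise ValueError("unknown tag: " + tag)
    if "nback" in data and data["nback"] != len(data["back"]):
        raise ValueError("nback does not match the number of 'back' lines")
    return data


# Made-up numbers, for illustration only.
SAMPLE = """\
! Number of events (candidates) observed
cand 3
nsub 1.2 0.3 0.4
lum 100.0 0.0 5.0
nback 1
back 2.0 0.5 0.1 0.01 0.05 0.005 0.045 0.055
"""

print(parse_limit_input(SAMPLE))
```

The check on nback catches the case where the number of 'back' lines disagrees with the declared number of background sources.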

Calling syntax

The command line syntax for the limit program is:

$limit input-file-name nom-sig-eff eff-stat-err low-ESCALE-eff high-ESCALE-eff smax CL name minstep

where the meanings of the arguments are the following.

The file sample_job.com is a sample command file that runs the program twice, for two different signal efficiencies but with the same background calculations (as would be the case for a search that has a mass-dependent signal efficiency, but only one set of signal selection cuts). It uses the input file sample_input_file.dat. Both files are available in the same directory as the limit program itself.
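The same pattern, one input file shared by several signal efficiencies, can be sketched in Python. This is an illustration only and does not reproduce sample_job.com. The efficiency values, the ntuple names, and the way numbers are formatted as strings are all assumptions. The argument order follows the calling syntax shown above.

```python
def build_limit_command(input_file, nom_sig_eff, eff_stat_err,
                        low_escale_eff, high_escale_eff,
                        smax, cl, name, minstep, program="limit"):
    """Return the argument list for one run, in the documented order."""
    efficiencies = (nom_sig_eff, eff_stat_err, low_escale_eff, high_escale_eff)
    return [program, input_file, *map(str, efficiencies),
            str(smax), str(cl), name, str(minstep)]


# Hypothetical mass points: (ntuple name, nominal eff, stat err, low, high).
MASS_POINTS = [
    ("S000070", 0.050, 0.004, 0.046, 0.054),
    ("S000080", 0.062, 0.005, 0.057, 0.066),
]

commands = [
    build_limit_command("sample_input_file.dat", nom, stat, low, high,
                        smax=100.0, cl=0.95, name=name, minstep=0.05)
    for name, nom, stat, low, high in MASS_POINTS
]

for cmd in commands:
    print(" ".join(cmd))
```

Only the signal efficiency arguments and the ntuple name change between runs. The input file, and therefore the background description, stays the same.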

Output

The program produces text output to the screen (or to the batch job log file, if run in batch mode). An abbreviated sample of the output is given below, with explanatory notes. The whole output can be found in the file sample_job.log, in the same directory as the limit program. The output ntuple and histogram files from this job are also in that directory.

Sample output with notes

In the listing below, the program output comes first on each line. Added notes follow the output in square brackets.

User PATERNO BATCH Job BATCH_320 started 5-DEC-1997 08:26:09.88
Running on file
IN$FILE [this is the name of the input file; here, we used a logical name]

Max cross section is: 100.0000 [this is the value of smax]
Number of samples is: 100000 [the number of samples generated for numerical integration, not under user control]
Ntuple name is: S000070 [the name of the output ntuple. The file will be called s000070.nt4, in this example]
Histogram file name is: S000070.hbook [the name of the output histogram file]
RZOPEN. record length: 8191 > maximum safe value (8191 words). [irrelevant (and incorrect) warning from HBOOK]
RZOPEN. You may have problems transferring your file to other systems or writing it to tape.
----> Opened HBOOK File : S000070 .nt4 [this is the name of the ntuple file]
with top-directory : //S000070

MAKE_NT finds ntuple size is 3 words
mean cross section: 0.0000000E+00 +/= 0.0000000E+00 [don't worry about this]
step size had been 0.0000000E+00, resetting to 5.0000001E-02 picobarns [final step size for integrating over cross section]
Closing file for ntuple S000070
cross section limit: 7.450000 with confidence 0.9496512 [this is the answer: the cross section limit and its CL, which should be close to the requested CL value]


The following table gives both the posterior density and its integral. These are the same as the values in the histograms in the .hbook file, except that only a few bins (one in every 50) are printed. The columns are (1) bin number, (2) cross section for that bin, (3) posterior density for that cross section, and (4) cumulative posterior probability for that cross section.

Note that the cumulative posterior probability has reached 1.0 long before we get to the end of the range of integration, and that the posterior density is down by many orders of magnitude from the peak value long before we get to the end of the range of integration. This is what a successful calculation will have as a result. If this is not the case, then please read the caveats.

1 0.0000000E+00 4.1103374E-02 4.1103374E-02
51 2.500000 3.6456413E-03 0.9485500 
101 5.000000 1.0836881E-04 0.9985719 
151 7.500000 2.8000923E-06 0.9999629 
201 10.00000 7.8052437E-08 0.9999993 
251 12.50000 2.4651894E-09 1.000000 
301 15.00000 8.8258220E-11 1.000000 
351 17.50000 3.5374802E-12 1.000000 
401 20.00000 1.5627079E-13 1.000000 
451 22.50000 7.4769758E-15 1.000000 
501 25.00000 3.8047603E-16 1.000000 
551 27.50000 2.0243446E-17 1.000000 
601 30.00000 1.1103240E-18 1.000000 
651 32.50000 6.2106174E-20 1.000000 
701 35.00000 3.5156505E-21 1.000000 
751 37.50000 2.0034867E-22 1.000000 
801 40.00000 1.1454468E-23 1.000000 
851 42.50000 6.5552300E-25 1.000000 
901 45.00000 3.7497511E-26 1.000000 
951 47.50000 2.1420430E-27 1.000000 
1000 49.95000 1.2933384E-28 1.000000 
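The two signs of a successful calculation described above can be checked automatically. The Python sketch below is not part of limit. It reads rows in the four-column format of the table and tests that the cumulative probability ends close to 1 and that the density at the end of the range is far below its peak. The tolerances are arbitrary choices made for this illustration.

```python
def parse_rows(text):
    """Parse lines of 'bin  cross-section  density  cumulative'."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        rows.append((int(fields[0]), float(fields[1]),
                     float(fields[2]), float(fields[3])))
    return rows


def integration_looks_converged(rows, min_cumulative=0.999, max_ratio=1e-6):
    """True if the posterior is well contained in the range of integration."""
    peak = max(density for _, _, density, _ in rows)
    _, _, last_density, last_cumulative = rows[-1]
    return last_cumulative >= min_cumulative and last_density <= max_ratio * peak


# A few rows copied from the table above.
SAMPLE_ROWS = """\
1 0.0000000E+00 4.1103374E-02 4.1103374E-02
51 2.500000 3.6456413E-03 0.9485500
251 12.50000 2.4651894E-09 1.000000
1000 49.95000 1.2933384E-28 1.000000
"""

print(integration_looks_converged(parse_rows(SAMPLE_ROWS)))
```

A result of False suggests that the range of integration is too short for the posterior in question.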


mean background: 5.277243 +/- 2.139740 [not important]
mean lum*eff: 0.8359425 +/- 0.1758240 [not important]

Caveats

The limit program is somewhat finicky. Warnings for the user can be found at http://d0server1.fnal.gov/users/paterno/public_html/probability/limit_recipe/caveats.html.

Thanks

Special thanks go to Alexander Belyaev, who ported the program from the Alpha to the Challenge.


This page was last modified: 09/04/98 10:00 AM.

This page is kept by Marc Paterno (paterno@fnal.gov).