Case:Tiramisu



Salmonella Enteritidis gastroenteritis following tiramisu consumption at a high school graduation ceremony, Germany, June 1998: Using EpiData in an Outbreak Investigation



Acknowledgements:

We thank Dr. Anja Hauri and EPIET for allowing us the use of these data as the subject of this training material.

Development Team
Instructional Designer and Adapter: Pedro Arias, MD, MPH, FETP Spain
EpiData Technical Advisor: Jens Lauritsen, MD, EpiData Association, Denmark

Learning Objectives and Notes


  • Understanding how to organize Data Analysis in an Outbreak Investigation
  • Understanding how to describe an Outbreak in terms of: Time, Place, Person
  • Understanding basic concepts about analytical studies: Retrospective Cohort Study
  • Understanding the utility of Software during Outbreak investigations


  • Warning: You will need to download and unzip this sample file.
  • NOTE:
 This instruction is based on the study: 
 Salmonellose-Ausbruch: kontaminierte Desserts auf Abiturfeier
 Beispiel einer epidemiologischen Ermittlung (›retrospektive Kohortenstudie‹)
                            By
 Hauri, A M(1), Laude G (1), Steinitz H(2), Maaßen,S(2), Effelsberg W(2), Ammon A(1), Kist M(3),  
 Tschäpe H(1),Petersen L(1,4)
 1 Robert Koch-Institut, Berlin, Germany
 2 County Administration Breisgau-Hochschwarzwald - Public Health Office, Freiburg, Germany
 3 Institute for Hygiene and Microbiology, University of Freiburg, Germany
 4 Centers for Disease Control and Prevention, Atlanta, U.S.A.



Introduction

(The following is an excerpt from the report of the original study)

On 26 June 1998 the St Sebastian High School in Stegen, Germany, celebrated graduation by organising a party at which 250 to 350 participants were expected. Attendees included graduates from St Sebastian High School, their families and friends, teachers, 12th grade students and some graduates from the nearby Marie-Curie school of Kirchzarten.

A self service party buffet was supplied by a commercial caterer from Freiburg. Food was prepared the day of the party and transported in a refrigerated van to the school.

Festivities started with a dinner buffet open from 8.30 pm and followed by a dessert buffet offered from 10 pm. The party and the buffet extended late during the night and alcoholic beverages were quite popular. All agreed it was a party to be remembered.


  • The alert

On 2nd July 1998, the Freiburg Health office of the Federal Council Office of Breisgau-Hochschwarzwald reported to the Robert Koch Institute (RKI) in Berlin the occurrence of many cases of gastroenteritis following the graduation party described above. More than 100 cases were suspected among participants and some of them were admitted to nearby hospitals. Sick people suffered from fever, nausea, diarrhoea and vomiting lasting for several days. Most believed that the tiramisu consumed at dinner was responsible for their illness. Salmonella Enteritidis was isolated from 19 stool samples.

The Freiburg health office sent a team to investigate the kitchen of the caterer. Food preparation procedures were reviewed. Food samples, except tiramisu (none was left over), were sent to the laboratory of Freiburg University. Microbiological analyses were performed on samples of the following: brown chocolate mousse, caramel cream, remoulade sauce, yoghurt dill sauce, and 10 raw eggs.

The Freiburg health office requested help from the RKI in the investigation to assess the magnitude of the outbreak and identify potential vehicle(s) and risk factors for transmission in order to better control the outbreak.

Cases were defined as any person who attended the party at St Sebastian High School and who, between 27 June and 6 July 1998, suffered from:

  • diarrhoea (at least 3 loose stools in 24 hours), or
  • at least three of the following symptoms: vomiting, fever >= 38.5 °C, nausea, abdominal pain, headache.

Students from both schools attending the party were asked through phone interviews to provide names of persons who attended the party.

Overall 291 responded to enquiries and 103 cases were identified (Attack rate = 35%). Among these cases, 84 received medical treatment and four were admitted to hospitals. Attack rates by age group were 36.6% for persons < 20 years, 32.1% for persons 20 to 29 years, and 36.8% for persons older than 29 years.

Most cases occurred between 27 and 29 June, with an early peak from 00:00 on 27 June until 06:00 on 28 June.

Below is an example of the first page of the paper questionnaire administered to the attendees of the ceremony (some comments explaining the meaning of the questions have been added).

Example of questionnaire page

In this case study we will assume that the whole process of developing the questionnaire, checking data quality and cleaning the dataset has already been completed. We will focus on data management during the analysis phase and on the analytical, statistical and epidemiological aspects of the outbreak.

Starting a New Project

  • Create and name a new folder

Create a new folder by right-clicking anywhere, choosing NEW, and then FOLDER. A new folder will appear. The blinking cursor inside the title of the folder indicates that you can type in a new name. (If you do not see a blinking cursor, right-click on the new folder, select Rename, and then type the new name.) You can give the folder any name you want, but it is a good idea to use a meaningful one; “Tiramisu Investigation” is a good name.

You can create a folder for your investigation anywhere you like on your computer; you do not have to create it on the same drive as the EpiData program files. However, it will be easier to remember where your files are located if you always use the same system, for example creating all project folders under a “Your Name” folder.

Hint: Always create a separate folder for each project.


  • Unzip necessary files

To work through this exercise you need to UNZIP the file Tiramisu.zip into your project folder.

Opening and preparing the dataset

  • 1) First we need to open the dataset: Read the data file.
a) Click on the READ DATA button and search your folder and file.
b) Select TEIL.REC
c) Click Open

EpiData will show some messages with information about the data file: its name, the number of records, the number of fields, etc. In this case we have 291 records in the dataset.


Hint: Always check that the number of records you are working with is the number you expect.

  • 2) Changing labels of variables and Values

Sometimes, when you start analyzing your dataset (or a dataset that someone else has created), you realize that the variable names are not very meaningful (in our case because they are in German). Maybe the software used to create the dataset did not allow labels to be added to the variables or to their values. In those cases it is important to spend some time preparing the dataset.

We are going to add meaningful labels to the variable names and to the values.

To add (or change) the label of a variable, use the command: LABEL

In the command prompt write:

LABEL DUNKLE "Dark Mousse au Chocolat"

Here LABEL is the command’s name, DUNKLE is the name of the variable we want to add the label to and “Dark Mousse au Chocolat” is the Label itself.

To add labels to the values of DUNKLE you have to use the command: LABELVALUE

In the command prompt write:

LABELVALUE DUNKLE /1="Yes" /2="No" /9="Don't know"

Here LABELVALUE is the command’s name and DUNKLE is the variable whose values we want to label; after that you specify which label corresponds to each value.

Finally, we are going to define a user-defined missing value. If you leave a variable empty when entering data, the result is a missing value (a system missing value), represented by a dot “.”. Sometimes you also want to treat some of the values you have entered as missing, for example 9 (Don’t know) or 99 (Does not apply). Those are called user-defined missing values. EpiData Analysis treats both kinds of missing values in the same way.

To define a value as a user-defined missing value, use the command: MISSINGVALUE

In the command prompt write:

MISSINGVALUE DUNKLE /9

Here MISSINGVALUE is the command’s name, DUNKLE is the variable for which we want to define a missing value, and 9 is the value to be treated as missing. Please note the use of /.
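To make the idea of system and user-defined missing values concrete, here is a minimal Python sketch (not EpiData code). The list of answers is made up for illustration, and the function name `to_missing` is hypothetical; it only shows how a code such as 9 can be treated exactly like a blank.

```python
# Hypothetical illustration: treat the code 9 ("Don't know") as missing.
USER_MISSING = {9}

def to_missing(value, user_missing=USER_MISSING):
    """Return None for a blank (system missing) or a user-defined missing code."""
    if value is None or value in user_missing:
        return None
    return value

dunkle = [1, 2, 9, None, 1]            # made-up answers, not the outbreak data
cleaned = [to_missing(v) for v in dunkle]
valid = [v for v in cleaned if v is not None]

print(cleaned)      # [1, 2, None, None, 1]
print(len(valid))   # 3
```

After the recoding, the blank and the 9 are indistinguishable, which is the behaviour described above for EpiData Analysis.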

Try It!


Now you can finish adding labels to the rest of the variables. Use the following table to do it:

Variable   Label                              Value labels                     Missing value
TIRAMISU   Tiramisu (you can skip this one)   1="Yes" 2="No" 9="Don't know"    9
V13        White Mousse au Chocolat           1="Yes" 2="No" 9="Don't know"    9
OBST       Fruit Salad                        1="Yes" 2="No" 9="Don't know"    9
BIER       Beer                               1="Yes" 2="No" 9="Don't know"    9
V16        Red Jelly                          1="Yes" 2="No" 9="Don't know"    9
VANILLE    Vanilla Sauce                      1="Yes" 2="No" 9="Don't know"    9


  • 3) Adding conditional values

During data entry, all questions about specific desserts were left blank in those records where the answer to the question DESSERT (Did you eat any dessert during the event?) was 2 (No). It is usually a good idea to have the value 2 filled in automatically for those questions during data entry. In our case we have to do it in Analysis.

We are going to use the "IF [logical condition] THEN [action]" command.

In the command prompt write the following and press Enter:

IF DESSERT=2 THEN TIRAMISU=2

This means: if the value of DESSERT in a record equals 2, assign the value 2 to TIRAMISU in that record. Because the command has no ELSE part, the program does nothing when DESSERT is not equal to 2.

You have to do the same for all the questions about specific desserts.
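The logic of this conditional assignment can be sketched in Python as follows. This is only an illustration of the rule, not EpiData syntax; the list of dessert variables is an assumption based on the variables named in this exercise, and the example records are invented.

```python
# Assumed list of dessert variables (taken from the variables used in this exercise).
DESSERT_VARS = ["TIRAMISU", "DUNKLE", "V13", "OBST", "V16", "VANILLE"]

def fill_no_dessert(record):
    """If DESSERT is 2 (No), set every specific dessert variable to 2 (No)."""
    filled = dict(record)                 # work on a copy, keep the original intact
    if filled.get("DESSERT") == 2:
        for name in DESSERT_VARS:
            filled[name] = 2
    return filled                         # no ELSE branch: other records are unchanged

# Invented records for illustration only
no_dessert = fill_no_dessert({"DESSERT": 2, "TIRAMISU": None, "DUNKLE": None})
ate_dessert = fill_no_dessert({"DESSERT": 1, "TIRAMISU": 1, "DUNKLE": None})
```

Note that the second record keeps its blank DUNKLE value, just as the IF command without ELSE leaves such records untouched.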

Describing the cases in terms of Person, Place and Time

A good epidemiological description can help you to develop hypotheses about the mode of transmission and the source of infection.


Classifying the records according with the case definition

Our dataset has the information needed to classify individuals automatically.

We need to create several new variables using the DEFINE command and assign values to them using either the LET command or the IF…THEN… command.


a) In the command prompt write:

DEFINE CASE # (press enter)

The CASE variable will be created as a numeric variable with a width of 1 digit. We will use it to store the final classification of our records (1=Case; 2=Non-case).

Now try to create this variable: SYMPNUM (Numeric; 1). We will use it to count how many symptoms each person had.

Hint: To create variables in EpiData Analysis, you can use the same symbols that you use when defining a QES file in EpiData Entry.

b) Now we will assign the value 0 to all records: write SYMPNUM=0 and press enter
c) Now we will ask the program to add 1 to the SYMPNUM variable for each symptom that is present in each record. In the command prompt you have to write:


if erbre=1 then sympnum=sympnum+1 (press enter)
if fever>=38.5 then sympnum=sympnum+1 (press enter)
if uebel=1 then sympnum=sympnum+1 (press enter)
if bauchs=1 then sympnum=sympnum+1 (press enter)
if kopf=1 then sympnum=sympnum+1 (press enter)

where erbre, uebel, bauchs and kopf are variable names in the original dataset:

erbre=vomiting
uebel=nausea
bauchs=abdominal pain
kopf=headache


d) Now we can classify our subjects as cases or non-cases. Write:

if (durchf=1 or sympnum>2) and dateonset>dmy(26,06,1998) and dateonset<dmy(30,06,1998) then CASE=1 else CASE=2

(You must write it on a single line)

How should we read this command? If the patient had diarrhoea (DURCHF) or more than two of the other symptoms, and the date of symptom onset was after 26/06/1998 and before 30/06/1998 (that is, from 27 to 29 June), the program assigns the value 1 to the variable CASE; in any other situation (ELSE) it assigns the value 2.
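The counting and classification steps can be summarised in a short Python sketch. It is an illustration of the logic only, not EpiData code. The function names are hypothetical, the fever threshold (38.5 °C or more) is taken from the case definition, and the onset window is the one used in the command above.

```python
from datetime import date

def count_symptoms(erbre, fever, uebel, bauchs, kopf):
    """Count the symptoms other than diarrhoea (1 = yes; fever in degrees Celsius)."""
    count = 0
    for flag in (erbre, uebel, bauchs, kopf):
        if flag == 1:
            count += 1
    if fever is not None and fever >= 38.5:
        count += 1
    return count

def classify(durchf, sympnum, dateonset):
    """Return 1 (case) or 2 (non-case), mirroring the IF ... THEN ... ELSE command."""
    in_window = (dateonset is not None
                 and date(1998, 6, 26) < dateonset < date(1998, 6, 30))
    if (durchf == 1 or sympnum > 2) and in_window:
        return 1
    return 2

# Invented examples
three = count_symptoms(erbre=1, fever=39.0, uebel=1, bauchs=2, kopf=2)
print(classify(2, three, date(1998, 6, 28)))   # 1: three symptoms, onset in window
print(classify(1, 0, None))                    # 2: diarrhoea, but onset date missing
```

The second example shows why a record with a missing onset date ends up as a non-case even when diarrhoea is reported.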

You can check your results doing a BROWSE of all variables of interest: DATEONSET, CASE, FEVER, etc.

It is a good idea to check in this way whenever you have made an assignment or recoded a variable.

e) To find out how many cases and non-cases we have, we can produce a frequency distribution of the variable CASE. Write FREQ CASE

You can see that there are 102 cases and 189 non-cases.

The investigation team documented that one person (IDNR=68) reported that his diarrhoea started some day between the 27th and the 30th, but he was unable to specify the exact day. It was therefore decided to include this man as a case even though the DATEONSET variable was missing.

This point highlights the importance of good documentation throughout the investigation and of keeping a record of all decisions taken.

f) To include that person as a case we can use: IF IDNR=68 then CASE=1


Creating an Epidemic Curve

Describing your cases in terms of time will give you clues about the mode of transmission of the disease.

An epidemic curve, or "epi curve" for short, is a two-dimensional graph that provides a simple visual display of an epidemic's magnitude and time course. The epidemic curve plots time along the X-axis and number of cases along the Y-axis. Because time is continuous, the epidemic curve is drawn as a histogram (no gaps between adjacent columns), not as a bar chart.

The units of time must be consistent along the length of the X-axis; for example, the same distance must equal 1 day anywhere along the X-axis. For a given graph, the most appropriate units of time for the X-axis depend on the incubation period of the disease, the length of time over which cases are distributed, and the points you wish to communicate with the graph.

One rule of thumb states that the units should be between one-eighth and one-third (roughly one-quarter) as long as the incubation period of the disease in question. So, for a common-source outbreak of Clostridium perfringens gastroenteritis (usual incubation period 10-12 hours), X-axis units of 2-3 hours would be suitable.
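The rule of thumb is simple arithmetic, as the following Python sketch shows. The function name is hypothetical and the 12-hour incubation period is just the upper end of the Clostridium perfringens example.

```python
def bin_width_range(incubation_hours):
    """Return (minimum, maximum) X-axis unit in hours: 1/8 to 1/3 of the incubation period."""
    return incubation_hours / 8, incubation_hours / 3

low, high = bin_width_range(12)
print(low, high)   # 1.5 4.0
```

Units of 2 or 3 hours fall inside this range, which agrees with the suggestion in the text.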

Creating an epidemic curve with EpiData is easy if the units of time are days or longer. It is a little more complicated if you want to display hours (as in our case).

To create an epidemic curve in hours we have to deal with two variables: DATEONSET (date of symptom onset) and BEGZEIT (time of symptom onset). We have to transform these two variables into a number of hours counted from an arbitrary point in time (which must be earlier than the first case). We can choose, for example, 00:00 hours on 26/06/1998.

From this arbitrary point we can count the number of hours until each case started presenting symptoms.

a) Define a new variable: You need a new variable to store your calculations. In the command prompt write the DEFINE command, the name of the new variable (for example ONSETHR) and the kind of data you want to store in it. In this case, your command prompt should look like:

Define ONSETHR ###

Meaning: create a new variable called ONSETHR; this variable should be a numeric integer with a width of 3 digits.

b) Assign values to the new variable: The values of the new variable depend on the values of DATEONSET and BEGZEIT. BEGZEIT is easy: we only need the hour part of the 24-hour time (i.e. if BEGZEIT is 1230 we want only 12; if it is 145 we only need 1). For DATEONSET we have to calculate the number of days between each case's date of onset and the reference day and multiply it by 24 (24 hours per day). Finally, because we want to show the cases accumulated in 6-hour periods, we divide the total number of hours by 6 and keep only the integer part of the result.

In the command prompt write the following syntax:

onsethr=Integer(((DATEONSET-(dmy(26,6,1998)))*24 +(integer(begzeit /100)))/6)


This is a little complicated, so let us look at this command carefully.

i) First the INTEGER(number)

This is easy: Integer transforms any number into an integer by dropping its decimal part. In our case, we want to take the integer part of the result of the whole calculation inside the outermost brackets.

ii) (DATEONSET-DMY(26,6,1998))*24

EpiData calculates the difference in days between the value of DATEONSET in each record and a reference value (26/06/1998; dmy(26,6,1998) is the syntax for writing a date in EpiData). Once the difference has been calculated, EpiData multiplies it by 24 to convert it into hours.

iii) Integer(begzeit/100)

Here EpiData divides the value of BEGZEIT in each record by 100 and then drops the decimal part.

Imagine we have a case with DATEONSET=”26/06/1998” and BEGZEIT=100, meaning that this case started with symptoms at 1 a.m. on 26 June.

* So, 26/06/1998-26/06/1998 = 0 days
* 0 days * 24 = 0 hours
* 0 hours + Integer(100/100) = 1 hour
* 1/6 = 0.17
* Integer(0.17) = 0
This case will be in group 0: all cases with DATEONSET=26/06/1998 and BEGZEIT<600 will be in this group.
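If you want to check the calculation outside EpiData, the same formula can be sketched in Python. This is an illustration only: the function names are hypothetical, and the sketch assumes that no onset is earlier than the reference point (for negative values, Python's integer division would not match dropping the decimal part).

```python
from datetime import date, datetime, timedelta

REFERENCE = date(1998, 6, 26)   # arbitrary starting point chosen in the text

def onset_group(dateonset, begzeit, hours_per_group=6):
    """Number of whole 6-hour periods between the reference point and symptom onset."""
    days = (dateonset - REFERENCE).days
    hour_of_day = begzeit // 100            # 1230 -> 12, 145 -> 1
    return (days * 24 + hour_of_day) // hours_per_group

def group_label(group, hours_per_group=6):
    """Readable label for a group, e.g. the hours covered on a given day."""
    start = (datetime(REFERENCE.year, REFERENCE.month, REFERENCE.day)
             + timedelta(hours=group * hours_per_group))
    end_hour = start.hour + hours_per_group - 1
    return f"{start.day:02d}/{start.month:02d} {start.hour:02d}-{end_hour:02d} h"

print(onset_group(date(1998, 6, 26), 100))    # 0
print(onset_group(date(1998, 6, 27), 1130))   # 5
print(group_label(4))                         # 27/06 00-05 h
```

The helper `group_label` is an extra convenience, not part of the EpiData command; it produces the kind of text you may want to use when relabelling the X-axis of the epidemic curve.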

Let us see some examples from our dataset:

Write SORT DATEONSET BEGZEIT in the command prompt and then press enter

Then write BROWSE DATEONSET BEGZEIT ONSETHR

The first few cases have these values:

DATEONSET    BEGZEIT   ONSETHR
27/06/1998   100       4
27/06/1998   400       4
27/06/1998   400       4
27/06/1998   900       5
27/06/1998   1000      5
27/06/1998   1000      5
27/06/1998   1000      5
27/06/1998   1100      5
27/06/1998   1130      5
27/06/1998   1200      6
27/06/1998   1230      6
……………………

Group 4 contains the cases with BEGZEIT within the first 6 hours of 27/06/1998 (hours 0 to 5).
Group 5 contains the cases with BEGZEIT within the second 6 hours of 27/06/1998 (hours 6 to 11).

To create an epidemic curve, all we have to do is:

i) Select CASE=1
ii) Histogram ONSETHR. You can either write this command in the command prompt or click on the GRAPHS button, choose Histogram and choose ONSETHR as the X variable.
iii) In the graph window you will see an “Edit All” button. Click on it.
iv) You will get a new window (Editing Chart) where you can change some parameters. Click on the Data tab and, in the column labelled “Text”, replace 4 with “27-jun-0-5 h.”. Delete 5 (second row). Replace 6 with “27-jun-12-17h.” Delete 7, and so on.
v) Now click on the Chart tab and then on the Axis-->Labels-->Style subtabs.
Editing charts properties

Change the Angle value to 45.

vi) When you finish click Close

The results should be similar to:

Epidemic curve


Describing the cases in terms of person and clinical variables.

a) First we want to describe the cases by sex. A frequency distribution is the right tool:
i) Click on the Analysis button and choose Frequency
ii) On the dialog window, select Sex and then click on the Pass this button
iii) Sex will be displayed in the selected variable Area.
iv) Click on RUN

The result should be something similar to:


Sex:    No.    %        Cum %
1       50     48.54    48.54
2       53     51.46    100.00
Total   103    100%
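The percentages in this table are easy to reproduce. The following Python sketch (not EpiData code, with a hypothetical function name) builds the same columns from the counts shown above.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, count, percent, cumulative percent)."""
    counts = Counter(values)
    total = sum(counts.values())
    rows, cumulative = [], 0
    for value in sorted(counts):
        cumulative += counts[value]
        rows.append((value,
                     counts[value],
                     round(100 * counts[value] / total, 2),
                     round(100 * cumulative / total, 2)))
    return rows

sex = [1] * 50 + [2] * 53        # counts taken from the table above
for row in frequency_table(sex):
    print(row)
# (1, 50, 48.54, 48.54)
# (2, 53, 51.46, 100.0)
```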






Try It!


Now you can finish the description of cases using other variables, for instance you can describe the cases by symptoms.


b) To describe or summarize a continuous numeric variable, like age, you need to use some specific measures (measures of central tendency and variation). With EpiData Analysis, you can do it using the describe command
i) Click on Analysis
ii) Select Describe
iii) Select ALTER (Age) and click on the Pass this button
iv) Click on RUN

You will get something like:

Variable  N=103  Sum      Mean   (95% cfi)     Min    p5     p10    p25    Median  p75    p90    p95    Max
ALTER     100    2731.00  27.31  24.32 30.30   13.00  16.00  17.00  18.00  19.50   31.25  55.80  57.00  80.0





You can also describe a continuous numeric variable using a Boxplot Graph (Box-Whisker):

i) Click on the Graph Button
ii) Choose Box plot
iii) From the Drop-down list choose ALTER
iv) Click RUN

Box-Plot Chart example

Boxplot of median and interquartile range (IQR = 25-75%). Whiskers: 1.5*IQR. Outliers exceed 1.5*IQR. N=100

Testing Hypothesis: Analyzing the Risk Factors

At this point in our retrospective cohort study, we hope to identify risk factors that might indicate the cause and mode of transmission of the disease. We used a questionnaire that asked everyone, both those who were exposed to the different foods and beverages and those who were not, whether they became ill after the exposure. What we are trying to do here is determine whether a risk factor (for example, eating Red Jelly, V16) is associated with an outcome (illness, CASE).

We will do this by creating a 2x2 table and looking at the p-value generated. Remember, the p-value is the probability of observing an association at least as strong as the one in your data if, in reality, the two variables were not associated and chance alone were at work. For example, a p-value of .75 means that an association this strong would be seen about 75% of the time by chance alone, so the data give little evidence of a real association. A low p-value (generally < .05) means that the observed association would be unlikely under chance alone, which suggests that the risk factor (e.g., eating a particular dessert) is associated with the outcome (illness).
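As a rough illustration of where such a p-value comes from, the Python sketch below computes the Pearson chi-square statistic (without continuity correction) and its p-value for a 2x2 table. The counts are hypothetical, not the outbreak data, and the test EpiData actually reports may differ (for example an exact test for small numbers).

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square and p-value (1 degree of freedom) for a 2x2 table.

    a: exposed and ill      b: exposed and not ill
    c: unexposed and ill    d: unexposed and not ill
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts for illustration only
chi2, p = chi_square_2x2(30, 20, 10, 40)
print(round(chi2, 2), p < 0.05)   # 16.67 True
```

With identical proportions of illness in both groups the statistic is 0 and the p-value is 1, which corresponds to no evidence of an association.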

Calculating Attack Rates

Because this is a retrospective cohort study, we can calculate the attack rate for each food among those who ate it and among those who did not. The true vehicle is likely to have three features:

a) The attack rate is high among persons who ate the food (high food-specific attack rate).
b) The attack rate is low among persons who did not eat the food (so the difference or ratio is high).
c) Most of the cases were exposed, so the exposure could “explain” most, if not all, of the cases.

EpiData does not have a specific command for calculating attack rates; however, it is possible to program one. We have created a program that does the calculation for you.

To calculate the attack rates you have to run a program called AR.PGM (you can find it in your project folder).

i) In the command prompt write:
SELECT (To eliminate any previous selection)
RUN AR.PGM
ii) The program will ask you for the name of the variable where the outcome status is recorded (in our case CASE, but it may have a different name in other files you work with in the future).
iii) Then the program will ask you for each food or beverage:
  • How to label each food (or beverage); you must write a short but meaningful label
  • The name of the variable where the information about each exposure is stored (for example TIRAMISU). You should try TIRAMISU, DUNKLE, V13, OBST, BIER, V16 and VANILLE.

NOTE: There are more variables that were investigated during the outbreak, but those are the most important.

You must be very careful when writing the variable names. If you make a mistake you will have to close the dataset (using the close command) and then run the program again.

After each food, EpiData will show the Attack Rate, Attributable Proportion in Exposed (eAF %), Attributable Fraction in population (pAF %) and RR with its 95% confidence interval.
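To see how these measures relate to each other, here is a minimal Python sketch using textbook formulas. It is not the code of AR.PGM, whose internal calculations are not shown here; the confidence interval uses the common log-based approximation, and the counts are hypothetical.

```python
import math

def attack_rate_summary(exposed_ill, exposed_total, unexposed_ill, unexposed_total):
    """Attack rates, RR with 95% CI, and attributable fractions for one food."""
    ar_exposed = exposed_ill / exposed_total
    ar_unexposed = unexposed_ill / unexposed_total
    rr = ar_exposed / ar_unexposed
    # Standard error of ln(RR), log-based approximation
    se = math.sqrt(1 / exposed_ill - 1 / exposed_total
                   + 1 / unexposed_ill - 1 / unexposed_total)
    ci = (rr * math.exp(-1.96 * se), rr * math.exp(1.96 * se))
    eaf = (rr - 1) / rr                                        # fraction among exposed
    paf = eaf * exposed_ill / (exposed_ill + unexposed_ill)    # fraction in population
    return {"AR exposed": ar_exposed, "AR unexposed": ar_unexposed,
            "RR": rr, "95% CI": ci, "eAF": eaf, "pAF": paf}

# Hypothetical counts: 40 of 50 eaters ill, 10 of 50 non-eaters ill
summary = attack_rate_summary(40, 50, 10, 50)
print(round(summary["RR"], 2), round(summary["eAF"], 2), round(summary["pAF"], 2))
# 4.0 0.75 0.6
```

In this invented example the food meets all three features of a likely vehicle: a high attack rate among eaters, a low one among non-eaters, and most cases exposed.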

You will have something like:

image:ATTACKRATES.PNG

Interpreting the results

Based on the results you have obtained, which foods or beverages seem to play a role in the occurrence of illness? Can you identify any potential vehicle or protective factor?

As you can see, several of the foods seem to play a role, and beer consumption may be a protective factor.

So we might think that this is a multi-vehicle outbreak. Before accepting such a hypothesis, we need to evaluate the independent role of each food. To do that, we can carry out a stratified analysis.

Stratified Analysis: Confounding

We want to assess whether Tiramisu is confounding the relationship between DUNKLE (exposure variable) and CASE.

To be considered a confounder a variable must meet three criteria:

1) It is associated with the exposure variable
2) It is a risk factor for the disease (there is a statistical association between the confounder and the disease even among those who are not exposed to the exposure variable)
3) It is not a consequence of the exposure variable

To test the first criterion you have to write:

TABLES DUNKLE TIRAMISU

And you get:


image:TIRADUNKLETAB.PNG




You can see that there is a statistical association between the two variables.

To test the second criterion, you have to write:

SELECT DUNKLE=2
  • In order to analyse the relationship between CASE and TIRAMISU only in those who didn’t eat Dark Mousse (DUNKLE)
EPITABLES CASE TIRAMISU

And you get:

image:CASETIRANODUNKLEETAB.PNG




The third criterion is not tested statistically, but it is obvious that consumption of Tiramisu is not a consequence of consumption of Dark Mousse.

So Tiramisu may be confounding the real relationship between consumption of Dark Mousse and the illness. For this reason it is necessary to do a stratified analysis.

In the command prompt write:

SELECT
  • In order to work again with all records
EPITABLES CASE DUNKLE TIRAMISU

And you will get something like:

image:CASEDUNKLETIRAETAB.PNG
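To illustrate what a stratified analysis can reveal, the Python sketch below compares a crude RR with a Mantel-Haenszel adjusted RR. The counts are hypothetical and were chosen to show an extreme case of confounding; they are not the outbreak data, and it is an assumption that the adjusted RR reported by EPITABLES is of the Mantel-Haenszel type.

```python
def risk_ratio(exposed_ill, exposed_total, unexposed_ill, unexposed_total):
    """Risk ratio for one 2x2 table."""
    return (exposed_ill / exposed_total) / (unexposed_ill / unexposed_total)

def mantel_haenszel_rr(strata):
    """Adjusted RR; each stratum is (exposed_ill, exposed_total, unexposed_ill, unexposed_total)."""
    numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return numerator / denominator

# Hypothetical counts by stratum of the suspected confounder
ate_tiramisu = (40, 50, 16, 20)      # mousse eaters 40/50 ill, non-eaters 16/20 ill
no_tiramisu = (2, 20, 5, 50)         # mousse eaters 2/20 ill, non-eaters 5/50 ill
strata = [ate_tiramisu, no_tiramisu]

crude = risk_ratio(40 + 2, 50 + 20, 16 + 5, 20 + 50)
adjusted = mantel_haenszel_rr(strata)
print(round(crude, 2), round(adjusted, 2))   # 2.0 1.0
```

Here the crude RR of 2 disappears completely once the data are stratified, because in each stratum the RR equals 1. A large difference between the crude and the adjusted RR is the signature of confounding.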


Try It!


Now you can carry out the rest of the stratified analyses, examining the role of each food both in the stratum of those who did not eat Tiramisu and in the stratum of those who did.

In each stratified table you have to look at the RR for those who did not eat Tiramisu.

Is the RR for eating Dark Mousse au Chocolat (DUNKLE), or any other dessert, higher than 1 among those who did not eat Tiramisu?

The results of this analysis suggest that Dark and White Mousse as well as Fruit Salad played a role in the illness, since their RRs are high even among those who did not eat Tiramisu.

It is not clear whether Vanilla Sauce and Red Jelly played a role. Because of the small numbers in the stratified analysis, a “real” association could go undetected.

Looking for Dose-response relationship

One of the criteria supporting causality is the existence of a relationship between the dose of exposure and the response.

We can test this relationship for Tiramisu using the variable TPORTION (How much Tiramisu did you eat?). First we have to assign a value for those who didn’t eat Tiramisu:

IF TIRAMISU=2 then TPORTION= 8

NOTE: We use a value that is higher than any other value of this variable to make sure that the EPITABLES command uses it as the reference group.

Now we have to add labels to the variable and its values:

LABEL TPORTION "How much Tiramisu did you eat?"
LABELVALUE TPORTION /8="None" /1="Small portion" /2="Normal portion" /3="Big portion"

Now we will compare each value against a reference group, in this case those who didn’t eat Tiramisu (TPORTION=8):

SELECT TPORTION=8 or TPORTION=1
  • To compare those who ate a small portion with those who didn’t eat tiramisu
EPITABLES CASE TPORTION
SELECT
  • To eliminate the previous selection
SELECT TPORTION=8 or TPORTION=2
  • To compare those who ate a normal portion with those who didn’t eat tiramisu
EPITABLES CASE TPORTION
SELECT
  • To eliminate the previous selection
SELECT TPORTION=8 or TPORTION=3
  • To compare those who ate a big portion with those who didn’t eat tiramisu
EPITABLES CASE TPORTION
SELECT
  • To eliminate the previous selection
SELECT TPORTION=8 or TPORTION=.
  • To compare those who don’t know how much tiramisu they ate with those who didn’t eat tiramisu
EPITABLES CASE TPORTION /M

NOTE the use of /M to include missing values in the table; otherwise you would get an error message.
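The idea behind these comparisons can be sketched in Python: each portion size is compared with the same reference group, and a dose-response relationship shows up as an RR that increases with the dose. The counts below are hypothetical, not the outbreak data, and the function name is an assumption made for this illustration.

```python
def risk_ratios_vs_reference(groups, reference):
    """groups maps a label to (ill, total); returns the RR of each group vs the reference."""
    ref_ill, ref_total = groups[reference]
    ref_risk = ref_ill / ref_total
    return {label: (ill / total) / ref_risk
            for label, (ill, total) in groups.items()
            if label != reference}

# Hypothetical counts, NOT the outbreak data
portions = {
    "None": (5, 100),
    "Small portion": (10, 40),
    "Normal portion": (30, 60),
    "Big portion": (24, 30),
}

rr = risk_ratios_vs_reference(portions, "None")
ordered = [rr["Small portion"], rr["Normal portion"], rr["Big portion"]]
increasing = all(lower < higher for lower, higher in zip(ordered, ordered[1:]))

for label, value in rr.items():
    print(label, round(value, 1))
print("RR increases with portion size:", increasing)
```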

As an example, the EPITABLES output for those who ate a small portion looks like:

image:SMALLTIRACASEETAB.PNG

Interpreting the results

Based on the results you have got, can you identify any trend?

Analyzing how beer consumption and Tiramisu eating relate to illness

From the crude analysis you have noticed that the occurrence of gastroenteritis was lower among those attendees who had drunk beer. To find out whether confounding is at work here, we have to stratify again.

SELECT

  • In order to eliminate any previous selection

EPITABLES CASE BIER TIRAMISU

And you get something like: image:CASEBEERTIRAETAB.PNG


You can see that the relationship between beer consumption and illness is not confounded by tiramisu consumption (the crude and the adjusted RR are very similar), and there is almost no difference between the RRs in the two strata of Tiramisu consumption.

Working with programs

When you are analyzing a dataset it is a good idea to save all the commands you use to manipulate and analyze the data in a file that you can use again later. This is especially useful if you are working with a database in a routine system such as a surveillance system, where data are updated, for example, every week and you have to perform the same analysis every week to prepare a routine report.

EpiData Analysis allows you to save your commands in a kind of file called a program. A program is a text file with the PGM extension that can be edited in any plain-text editor (you can use either Notepad or WordPad). EpiData Analysis also includes its own editor.

A PGM file is a series of commands (possibly just one) that are executed as a batch, meaning that they are executed automatically one after the other.

  • Saving, recalling and executing programs:

One way to make a program file is to enter commands interactively at the command prompt. If you wish to save the commands already entered, use the SAVEPGM command and the name of a file, for example:

SAVEPGM OUTBREAK.PGM

All the commands used during this working session will be placed in the file, which can then be edited later to remove unwanted commands or add new ones.

Of course you can also open your favourite text editor and write all the commands there directly. If you want to use the EpiData Editor, just click on the Editor button on the Work Flow toolbar of Analysis (a new window will open).

Once you have saved your commands in a PGM file, you can open it whenever you want using the EpiData Editor (FILE-->Open)

To execute a program you have several alternatives:

i) You can write RUN NAME.PGM at the command prompt (as we did to execute the program that calculates attack rates).
ii) In the EpiData Editor you can click on the icon: Run all (or press F9)
iii) Or in the EpiData Editor you can select the lines of the program that you want to execute (if you only want to execute a number of consecutive lines) and then click on the icon: Run select lines only (or press F8).

Epidata Editor in Analysis

You should use programs whenever you want to recode variables, make calculations using other variables or carry out any other kind of manipulation. The idea is not to make permanent changes to your original dataset but to keep it intact and have an easy way to redo the whole process.
