DISTANCE USER'S GUIDE
DATA>

In each of the data formats, the data can be spread over as many lines as needed, but no line may exceed 80 characters. Values on a line must be separated by commas or by one or more spaces. A semicolon separates the different types of input data when both grouped and ungrouped data formats are used for clustered populations, and it also marks the end of data input for the sample. Unlike commands, any information entered beyond the semicolon will create an error. The following example for format 1 would not be valid, because the distances are grouped and the cluster sizes are ungrouped:

    n1,n2,n3; s1,s2,s3,...,sn;

Instead, the grouped data (frequencies) and the ungrouped data (cluster sizes) must each start on a separate line, as shown in Table 1.

If there are no observations for a SAMPLE, no data values are given, but the semicolons must still be entered to indicate a sample of size 0. For formats 1 and 4, two semicolons must be given on separate lines; for formats 2, 3 and 5, only one semicolon is required.

If the observed objects are clusters, we recommend that you do not use formats 1 and 4 in Table 1. Instead, we recommend that the measurements (distance and cluster size) be entered ungrouped (EXACT) using format 2 or 5, even if they were recorded in intervals (grouped). Use the interval midpoint as the distance measurement for each observation in an interval. This enables a more detailed analysis, because the distance and cluster size measurements are paired for each object. For instance, if a WIDTH is chosen that truncates some of the distance observations, the corresponding cluster sizes cannot be discarded unless the measurements are paired, as they are in ungrouped data entry. Likewise, using only the cluster sizes observed within a pre-specified distance, or performing a size-bias regression, is possible only if the distance and cluster size measurements are entered in pairs.
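The midpoint recommendation and the reason for pairing can be sketched in a few lines of Python. This is only an illustration written outside DISTANCE: the function names, the (interval index, cluster size) record layout, and the rule that an observation is kept when its distance is less than or equal to the width are all assumptions made for the sketch, not part of the program.

```python
# Illustrative sketch only: plain Python, not DISTANCE syntax.
# All names and the record layout are hypothetical.

def to_midpoint_pairs(observations, cutpoints):
    """Turn (interval_index, cluster_size) records into
    (interval midpoint, cluster_size) pairs."""
    pairs = []
    for interval_index, cluster_size in observations:
        lower = cutpoints[interval_index]
        upper = cutpoints[interval_index + 1]
        pairs.append(((lower + upper) / 2, cluster_size))
    return pairs


def truncate(pairs, width):
    """Keep observations whose distance is within the truncation width.

    Assumption: distance <= width is kept. Because each distance carries
    its own cluster size, the matching cluster sizes are dropped with it.
    """
    return [(distance, size) for distance, size in pairs if distance <= width]


# Six clusters recorded in the intervals 0-1, 1-2 and 2-3 (made-up values).
cutpoints = [0, 1, 2, 3]
observations = [(0, 2), (0, 4), (0, 1), (1, 6), (1, 2), (2, 7)]

pairs = to_midpoint_pairs(observations, cutpoints)
kept = truncate(pairs, width=2)

# One "distance cluster-size" pair per observation, closed by a semicolon.
data_line = " ".join(f"{distance:g} {size}" for distance, size in pairs) + ";"
print(data_line)
print(kept)
```

With grouped entry, only the interval counts and an unordered list of cluster sizes would be available, so the cluster of size 7 seen in the outermost interval could not be identified and removed when the data are truncated at a width of 2.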
Proper estimation with the true interval nature of the distance data is obtained by specifying INTERVALS with the DISTANCE command at the ESTIMATE prompt (pg:110-111,149-150). This also allows one or more intervals to be combined into larger intervals in the analysis. A simple example with "unrealistically" small numbers of observations and intervals is given below to show how interval distance data are entered as exact using the midpoint and then analyzed using the intervals in which they were collected:

    OPTIONS;
    DIST=PERP;                  <-- exact is assumed
    OBJECT=CLUSTER;
    END;
    DATA;
    SAMPLE/EFFORT=14.5;
    .5 2 .5 4 .5 1 1.5 6 1.5 2 2.5 7;
    SAMPLE/EFFORT=4.5;
    .5 1 1.5 4 2.5 2 2.5 6;
    END;
    ESTIMATE;
    DIST/INTERVALS=0,1,2,3;     <-- data originally collected in these intervals
    ESTIMATE/KEY=HN /ADJUST=POLY;
    END;

The commands, which are valid at the DATA> prompt, are described below in alphabetical order.