Contact Amy Williams with questions or bug reports.
Download: hapi-1.03-x86_64.tgz
me@myhost$ ./hapi-[mr/ml]
Usage: hapi [OPTIONS] [marker list file] [map file] [pedigree file]
Options:
  -l, --log <filename>
        log to file <filename>
  -d, --data-analysis
        run data analysis
  --print-fams-trunc <fileprefix>
        print truncated pedigree file and quit
  --print-trans-homologs <min fam children>
        print CSV file with transmitted homologs for each child, for specified family size
  --print-haplotypes <min fam children>
        print CSV file with haplotypes for specified family size
  --print-text
        print either transmitted homologs or haplotypes in text format (not CSV)
  --print-all-trans-homologs
        print all transmitted homologs for all families to one large CSV file named 'all-trans-homologs.csv'

Required arguments are listed in brackets. These files are in the same format that Merlin uses. We describe each below. Note: the distribution of Hapi includes a simple example with each of these files.
The following is an example marker list file:

    M rs1
    M rs2
    M rs3

Note that the names of the markers need not be rs id numbers, but can be any sequence of characters. These names must be the same as those listed in the map file. Also note that the order of the markers has no physical meaning and can be arbitrary (thus markers on different chromosomes can occur intermixed, if desired). The map file specifies where the markers reside physically.
The following is an example map file, continuing with the above example:
    1 rs1 1.00
    1 rs3 1.356
    1 rs2 1.895

The above example specifies that rs1, rs2, and rs3 appear on chromosome 1, with rs3 appearing between rs1 and rs2. These markers are tightly linked, spanning a distance of 0.895 cM.
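The relationship between the two files can be checked with a few lines of Python. This is only an illustrative sketch and not part of Hapi: the function names are hypothetical, and it assumes the map file has exactly three whitespace-separated columns (chromosome, marker name, position in cM) with no header line, as in the example above.

```python
def parse_marker_list(text):
    """Return marker names from lines of the form 'M <name>'."""
    names = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "M":
            names.append(fields[1])
    return names


def parse_map(text):
    """Return {marker name: (chromosome, position in cM)}.

    Assumes three columns per line and no header line.
    """
    positions = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            chrom, name, pos = fields
            positions[name] = (chrom, float(pos))
    return positions


# The example files from the text above.
MARKERS = "M rs1\nM rs2\nM rs3\n"
MAP = "1 rs1 1.00\n1 rs3 1.356\n1 rs2 1.895\n"

markers = parse_marker_list(MARKERS)
positions = parse_map(MAP)

# Every marker in the marker list must also appear in the map file.
missing = [m for m in markers if m not in positions]

# The marker list order is arbitrary; the physical order comes from the map.
ordered = sorted(positions, key=lambda m: positions[m])
span = positions[ordered[-1]][1] - positions[ordered[0]][1]

print("missing from map:", missing)
print("physical order:", ordered)
print("span (cM): %.3f" % span)
```

The span is computed across all markers in the map, which is only meaningful here because all three markers lie on the same chromosome.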
Each line of the pedigree file contains five required columns plus SNP genotypes for each of the markers listed in the marker list file. The first column lists a name for the family of the individual; Hapi ignores this character string (note: if this is problematic for your application, contact me). The second column gives a numerical identifier for the individual within the pedigree; this value must be greater than zero. The third column lists the numerical identifier for the individual's father, and the fourth column lists the identifier for the mother. Use a '0' or 'x' character to designate an unknown father or mother. The fifth column lists the person's gender, with 1 for males, 2 for females, and 0 for unknown. The remainder of the line contains the marker genotypes in the same order as in the marker list file. Genotypes must be numerical identifiers, and the two alleles at a locus can be separated by either a '/' or a space. The allele code 0 is reserved for missing data.
The following is an example pedigree file for a nuclear family with three children that includes genotypes for three loci:
    fam1 10 x x 1 1/1 3/4 2/1
    fam1 12 x x 2 2/1 3/4 2/2
    fam1 21 10 12 1 1/1 3/3 1/2
    fam1 22 10 12 1 1/2 3/4 1/2
    fam1 23 10 12 2 1/1 3/4 2/2

Person 10 is the father, 12 is the mother, and 21, 22, and 23 are the children. Children 21 and 22 are male and 23 is female.
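To make the column layout concrete, the following Python sketch splits one pedigree line into its fields. It is an illustration of the format as described above, not Hapi's own parser; the function name and the dictionary keys are hypothetical.

```python
def parse_pedigree_line(line, n_markers):
    """Split one pedigree line into its five fixed columns and genotypes."""
    fields = line.split()
    family, person, father, mother, sex = fields[:5]

    # Alleles may be separated by '/' or by a space, so normalize to spaces.
    alleles = " ".join(fields[5:]).replace("/", " ").split()
    if len(alleles) != 2 * n_markers:
        raise ValueError("expected %d alleles, found %d"
                         % (2 * n_markers, len(alleles)))

    def parent_id(value):
        # '0' or 'x' designates an unknown father or mother.
        return None if value in ("0", "x") else int(value)

    return {
        "family": family,            # ignored by Hapi
        "id": int(person),           # must be greater than zero
        "father": parent_id(father),
        "mother": parent_id(mother),
        "sex": int(sex),             # 1 = male, 2 = female, 0 = unknown
        # One (allele, allele) pair per marker; allele 0 means missing data.
        "genotypes": [(int(alleles[i]), int(alleles[i + 1]))
                      for i in range(0, len(alleles), 2)],
    }


founder = parse_pedigree_line("fam1 10 x x 1 1/1 3/4 2/1", 3)
child = parse_pedigree_line("fam1 21 10 12 1 1 1 3 3 1 2", 3)
print(founder)
print(child)
```

The second call uses space-separated alleles to show that both separators yield the same genotype pairs.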
If desired, a comment may appear on a line by itself if a '#' occurs as the first character. For example:
    # This is a comment and is ignored
    fam1 10 x x 1 1/1 3/4 2/1
    ...
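A script that reads pedigree files can apply the same rule by skipping any line whose first character is '#'. The sketch below is illustrative only; the function name is hypothetical, and skipping blank lines is an added assumption that the text above does not state.

```python
def pedigree_records(lines):
    """Yield pedigree lines, skipping comment lines."""
    for line in lines:
        # A comment is a line with '#' as its very first character.
        if line.startswith("#"):
            continue
        # Assumption: blank lines carry no data and are skipped as well.
        if not line.strip():
            continue
        yield line.rstrip("\n")


SAMPLE = [
    "# This is a comment and is ignored\n",
    "fam1 10 x x 1 1/1 3/4 2/1\n",
    "fam1 12 x x 2 2/1 3/4 2/2\n",
]

records = list(pedigree_records(SAMPLE))
print(records)
```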