Contact Amy Williams with questions or bug reports.
Download: hapi-1.03-x86_64.tgz
me@myhost$ ./hapi-[mr/ml]
Usage: hapi [OPTIONS] [marker list file] [map file] [pedigree file]
Options:
  -l, --log <filename>                       log to file <filename>
  -d, --data-analysis                        run data analysis
  --print-fams-trunc <fileprefix>            print truncated pedigree file and quit
  --print-trans-homologs <min fam children>  print CSV file with transmitted
                                             homologs for each child,
                                             for specified family size
  --print-haplotypes <min fam children>      print CSV file with haplotypes
                                             for specified family size
  --print-text                               print either transmitted homologs or
                                             haplotypes in text format (not CSV)
  --print-all-trans-homologs                 print all transmitted homologs
                                             for all families to one large CSV file
                                             named 'all-trans-homologs.csv'
Required arguments are listed in brackets. These files are in the same format
that Merlin uses, and each is described below. Note: the distribution of Hapi
includes a simple example of each of these files.
The following is an example marker list file with three markers:
M rs1
M rs2
M rs3
Note that the names of the markers need not be rs id numbers, but can be any
sequence of characters. These names must be the same as those listed in the map
file. Also note that the order of the markers has no physical meaning and
can be arbitrary (thus markers on different chromosomes can occur intermixed,
if desired). The map file specifies where the markers reside physically.
The following is an example map file, continuing with the above example:
1 rs1 1.00
1 rs3 1.356
1 rs2 1.895
The above example specifies that rs1, rs2, and rs3 appear on chromosome 1, with
rs3 appearing between rs1 and rs2. These markers are tightly linked, spanning
a distance of 0.895 cM.
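As a rough illustration of how the marker list and map files relate, the
following Python sketch reads both examples above, checks that every listed
marker has a map entry, and recovers the physical order and span. It is not
part of Hapi; the function names read_marker_list and read_map are
hypothetical, and the sketch assumes whitespace-separated fields with no
header line, as in the examples shown here.

```python
def read_marker_list(lines):
    """Return marker names from lines of the form 'M <name>'."""
    names = []
    for line in lines:
        fields = line.split()
        if len(fields) == 2 and fields[0] == "M":
            names.append(fields[1])
    return names


def read_map(lines):
    """Return {marker name: (chromosome, position in cM)}."""
    positions = {}
    for line in lines:
        fields = line.split()
        if len(fields) == 3:
            chrom, name, pos = fields
            positions[name] = (chrom, float(pos))
    return positions


marker_lines = ["M rs1", "M rs2", "M rs3"]
map_lines = ["1 rs1 1.00", "1 rs3 1.356", "1 rs2 1.895"]

markers = read_marker_list(marker_lines)
positions = read_map(map_lines)

# Every marker in the list file needs an entry in the map file.
missing = [m for m in markers if m not in positions]

# The list order is arbitrary; the map file gives the physical order.
ordered = sorted(markers, key=lambda m: positions[m])
span = positions[ordered[-1]][1] - positions[ordered[0]][1]

print(missing)         # []
print(ordered)         # ['rs1', 'rs3', 'rs2']
print(f"{span:.3f}")   # 0.895
```

Sorting by the (chromosome, position) pair here compares chromosome names as
strings, which is enough for a single chromosome but would need a numeric key
for genome-wide files.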
Each line of the pedigree file contains five required columns plus SNP genotypes for each of the markers listed in the "marker list file". The first column lists a name for the family of the individual; Hapi ignores this character string (note: if this is problematic for your application, contact me). The second column gives a numerical identifier for the individual within the pedigree; this value must be positive (non-zero). The third column lists the numerical identifier of the individual's father, and the fourth column lists the identifier of the mother. Use a '0' or 'x' character to designate an unknown father or mother. The fifth column lists the person's gender, with 1 for males, 2 for females, and 0 for unknown. The remainder of the line contains the marker genotypes in the same order as in the marker list file. Alleles must be numerical identifiers, and the two alleles for a given locus can be separated either by a '/' or by a space. The allele code 0 is reserved for missing data.
The following is an example pedigree file for a nuclear family with three children that includes genotypes for three loci:
fam1 10 x x 1 1/1 3/4 2/1
fam1 12 x x 2 2/1 3/4 2/2
fam1 21 10 12 1 1/1 3/3 1/2
fam1 22 10 12 1 1/2 3/4 1/2
fam1 23 10 12 2 1/1 3/4 2/2
Person 10 is the father, 12 is the mother, and 21, 22, and 23 are the
children. Children 21 and 22 are male and 23 is female.
If desired, a comment may appear on a line by itself if a '#' occurs as the first character. For example:
# This is a comment and is ignored
fam1 10 x x 1 1/1 3/4 2/1
...
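The pedigree format described above can be read with a few lines of Python.
The sketch below is only an illustration of the column layout, not Hapi's own
parser; the function name parse_pedigree_line and the dictionary keys are
hypothetical. It assumes whitespace-separated fields, accepts alleles written
either as "1/2" or as two space-separated values, and skips comment lines that
begin with '#'.

```python
def parse_pedigree_line(line, num_markers):
    """Parse one pedigree line; return None for comments and blank lines."""
    if not line.strip() or line.startswith("#"):
        return None
    fields = line.split()
    family, person, father, mother, sex = fields[:5]

    # Alleles may appear as "1/2" or as two separate tokens "1 2".
    alleles = []
    for token in fields[5:]:
        alleles.extend(token.split("/"))
    if len(alleles) != 2 * num_markers:
        raise ValueError(
            f"expected {2 * num_markers} alleles, got {len(alleles)}"
        )
    genotypes = [
        (int(alleles[i]), int(alleles[i + 1]))
        for i in range(0, len(alleles), 2)
    ]

    return {
        "family": family,
        "id": int(person),
        # '0' or 'x' marks an unknown parent.
        "father": None if father in ("0", "x") else int(father),
        "mother": None if mother in ("0", "x") else int(mother),
        "sex": int(sex),  # 1 = male, 2 = female, 0 = unknown
        "genotypes": genotypes,
    }


pedigree_lines = [
    "# This is a comment and is ignored",
    "fam1 10 x x 1 1/1 3/4 2/1",
    "fam1 12 x x 2 2/1 3/4 2/2",
    "fam1 21 10 12 1 1/1 3/3 1/2",
]

people = []
for line in pedigree_lines:
    record = parse_pedigree_line(line, num_markers=3)
    if record is not None:
        people.append(record)

founders = [p["id"] for p in people if p["father"] is None and p["mother"] is None]
print(founders)  # [10, 12]
```

Storing each genotype as a pair of integers keeps the missing-data code 0
easy to test for in later steps.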