MID is a tool for detecting microinversions (MIs) by mapping initially unmapped short reads back onto a reference genome sequence. The input is an unmapped BAM file; the output files contain the detailed alignment of each unmapped read that carries an MI (output_i) and a list of unique MIs (o_inv).

PREREQUISITES
64-bit GNU/Linux
GCC 4.0 with Standard C++ Library
Python 2.7


USAGE
1. Download the MID source code (MID.tar.gz) from http://cqb.pku.edu.cn/ZhuLab/MID

2. Install bowtie from http://sourceforge.net/projects/bowtie-bio/files/bowtie
or http://cqb.pku.edu.cn/ZhuLab/MID
The bowtie executable must be on the system's $PATH
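
One way to confirm that bowtie is reachable before running MID is to scan $PATH for the executable. The following is a minimal sketch and is not part of MID; the helper name find_on_path is hypothetical, and the code is written to run under both Python 2.7 and Python 3.

```python
import os


def find_on_path(name, path=None):
    """Return the full path of executable `name`, or None if it is not found.

    `path` is a search string in $PATH format; it defaults to the
    PATH of the current environment.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        # Accept only regular files that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    location = find_on_path("bowtie")
    if location is None:
        print("bowtie was not found on $PATH")
    else:
        print("bowtie found at " + location)
```

If the script reports that bowtie was not found, add the directory containing the bowtie executable to $PATH and run it again.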

3. Get UCSC hg19.fa and the pre-built bowtie index of UCSC hg19 from
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/
http://bowtie-bio.sourceforge.net/manual.shtml
Or download them from http://cqb.pku.edu.cn/ZhuLab/MID
Extract the index by
$ tar -xzvf index.tar.gz

4. Download the Cython, pysam and Biopython modules for Python from
https://pypi.python.org/pypi/Cython/
http://code.google.com/p/pysam/downloads/list
http://biopython.org/DIST
Or download them from http://cqb.pku.edu.cn/ZhuLab/MID
Install the modules by
$ tar -xzvf Cython-0.17.4.tar.gz
$ cd Cython-0.17.4
$ python setup.py install
$ cd ..
$ tar -xzvf pysam-0.7.tar.gz
$ cd pysam-0.7
$ python setup.py install
$ cd ..
$ tar -xzvf biopython-1.66.tar.gz
$ cd biopython-1.66
$ python setup.py install
$ cd ..
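
After installation, it can be useful to check that the three modules are importable by the same Python interpreter that will run MID.py. The sketch below is not part of MID; the helper name missing_modules is hypothetical. Note that Biopython is imported under the name Bio.

```python
import importlib


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Import names: Cython -> "Cython", pysam -> "pysam", Biopython -> "Bio".
    absent = missing_modules(["Cython", "pysam", "Bio"])
    if absent:
        print("Missing modules: " + ", ".join(absent))
    else:
        print("All required modules can be imported")
```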

5. Extract MID.tar.gz by $ tar -xzvf MID.tar.gz
Run the program from the command line
$ python MID.py -a unmapped -r hg19.fa -i index -v erranchor -p parallel -s anchor -k kmer -m matchnum -e errkmer -g mergenum -c cutsize
[Options]
-a/--unmapped unmapped BAM file of 1000 Genomes Project sample (e.g., HG01880.unmapped.ILLUMINA.bwa.ACB.low_coverage.20120522.bam)
-r/--reference reference sequence (e.g., hg19.fa)
-i/--index bowtie index (e.g., hg19)
-v/--erranchor error number in the anchors (default: 1)
-p/--parallel number of alignment threads (default: 1)
-s/--anchor length of anchors (default: 18)
-k/--kmer length of kmers (default: 14)
-m/--matchnum number of matching serial (default: 5)
-e/--errkmer error number in each kmer (default: 2)
-g/--mergenum deviation for merging two subsequences (default: 3)
-c/--cutsize length of cutting size (default: 0)
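
When running MID on several samples, the command line can be assembled programmatically from the options and defaults listed above. The following is a minimal sketch under that assumption, not part of MID; the names OPTIONS and build_mid_command are hypothetical, while the flags and default values are copied from the option list.

```python
# (long name, short flag, default value), in the order of the option list.
OPTIONS = [
    ("erranchor", "-v", 1),
    ("parallel", "-p", 1),
    ("anchor", "-s", 18),
    ("kmer", "-k", 14),
    ("matchnum", "-m", 5),
    ("errkmer", "-e", 2),
    ("mergenum", "-g", 3),
    ("cutsize", "-c", 0),
]


def build_mid_command(unmapped_bam, reference, index, **overrides):
    """Return the MID.py command as a list of arguments.

    Keyword arguments override the documented defaults and must use the
    long option names (e.g. parallel=4).
    """
    known = set(name for name, _, _ in OPTIONS)
    unknown = sorted(set(overrides) - known)
    if unknown:
        raise ValueError("unknown option(s): " + ", ".join(unknown))
    command = ["python", "MID.py",
               "-a", unmapped_bam, "-r", reference, "-i", index]
    for name, flag, default in OPTIONS:
        command.extend([flag, str(overrides.get(name, default))])
    return command


if __name__ == "__main__":
    # Print the command for one sample, using 4 alignment threads.
    print(" ".join(build_mid_command("sample.unmapped.bam", "hg19.fa",
                                     "hg19", parallel=4)))
```

The returned list can be passed directly to subprocess.call, which avoids shell quoting problems with unusual file names.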

6. To compile the files yourself, remove the previous executables and recompile after extracting MID.tar.gz in step 5:
$ make clean
$ make
To remove all MID files, run $ make remove


EXAMPLE 
    For sample HG01880 from the 1000 Genomes Project, the command line would be:
	$ python MID.py -a HG01880.unmapped.ILLUMINA.bwa.ACB.low_coverage.20120522.bam -r hg19.fa -i hg19 -v 1 -p 1 -s 18 -k 14 -m 5 -e 2 -g 3 -c 0

    The input and output files for this example can be downloaded from http://cqb.pku.edu.cn/ZhuLab/MID
    Extract the input file by $ tar -xzvf input.tar.gz
    Extract the output files by $ tar -xzvf output.tar.gz
