University of Southampton School of Medicine


 
     
 
FAQ / troubleshooting     |     LDMAP on Iridis     |   map summary       
 

1) FTP from Cedar or HTTP from http://www.som.soton.ac.uk/research/geneticsdiv/epidemiology/LDMAP/



2) From ver. 29Apr06 onward, LDMAP_cluster updates itself automatically online, so that all users receive the latest version of the program. (29Apr06).

If you are still using an older version of the program, compare the version date in the program header with the version date on the website here.





3) From ver. 10Apr06 onward, the LDMAP_cluster program can handle multi-dataset submission. It is recommended to submit no more than 2 datasets (~60k SNPs each) at the same time, to avoid exceeding the 5Gb storage space. However, if you already have 2 datasets running and have checked that you have more than 3Gb left ("checkSeg" and "du -s ../<username>"), by all means submit another one. Each dataset consumes ~2 - 2.5Gb, depending on the number of SNPs it contains.



4) You are given 5Gb of storage space on Iridis 2. Use "du -s ../<userID>" to see how much space you are currently using and work out what remains; click here.
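The space checks described in items 3 and 4 can be sketched as a small shell helper. This is only an illustration: the function names used_kb and can_submit are hypothetical, and the conversion of the 5Gb allowance and the 3Gb threshold into kilobytes (1Gb = 1024 * 1024 kb) is an assumption rather than something documented for Iridis 2.

```shell
#!/bin/sh
# Sketch only: the 5Gb and 3Gb figures come from the FAQ text; the unit
# conversion (1Gb = 1024 * 1024 kb) is an assumption.
QUOTA_KB=$((5 * 1024 * 1024))    # 5Gb storage allowance
NEEDED_KB=$((3 * 1024 * 1024))   # free space suggested before adding a dataset

# used_kb DIR: print the disk usage of DIR in kilobytes.
used_kb() {
    du -sk "$1" | awk '{ print $1 }'
}

# can_submit USED_KB: succeed only if more than NEEDED_KB would remain free.
can_submit() {
    [ $((QUOTA_KB - $1)) -gt "$NEEDED_KB" ]
}

# Example call (not run here):
#   can_submit "$(used_kb "$HOME")" && echo "enough space for another dataset"
```

Keeping the comparison in a separate function means the threshold can be checked against any usage figure, not only the one reported by du.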



5) The time needed to construct an LDU map from a SNP dataset depends on many factors, such as the size of the dataset and the availability of computing resources on Iridis 2. Currently (27Apr06), the maximum limit of this program is ~70,000 SNPs. If a dataset contains >70k SNPs, it is split into smaller datasets with an overlap of ~200 SNPs. Under favourable conditions, the LDU map can be constructed in 3 - 5 hrs from a single dataset of 68k SNPs (34 segments running in parallel). Under the normal usage load on Iridis 2, you can typically expect to have the results within 24 hrs.
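The figure of 34 segments for 68k SNPs follows from the 2000-SNP segment size. A rough sketch of that arithmetic is given below. The function name segment_count is hypothetical, and the rounding rule (a remainder of fewer than 1000 SNPs is absorbed into the last segment) is an assumption inferred from the 1000 - 2999 range quoted for the last segment in item 15, not a documented rule of LDMAP_cluster.

```shell
#!/bin/sh
SEG_SIZE=2000   # SNPs per segment, as stated in the FAQ

# segment_count N: estimate how many segments a dataset of N SNPs yields.
# Assumption: a remainder under 1000 SNPs is absorbed into the last segment.
segment_count() {
    count=$(( ($1 + SEG_SIZE / 2) / SEG_SIZE ))
    [ "$count" -lt 1 ] && count=1
    echo "$count"
}

segment_count 68000   # prints 34
```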



6) The segment / job is terminated (killed) on the computer cluster once the requested time (10 hours) is up. This is very rare: only 7 out of 4195 segments have exceeded the 10 hrs. Please contact me if you encounter such an instance. The submission script must be amended to request a longer computing time.
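Amending the requested time amounts to changing the walltime line of the submission script. The sketch below shows one way to do that with sed. The helper name set_walltime and the value of 20 hours are illustrative assumptions; neither comes from LDMAP_cluster, and whether the queue accepts a longer request is not stated here.

```shell
#!/bin/sh
# set_walltime SCRIPT HH:MM:SS: print SCRIPT with its "#PBS -l walltime=" line
# rewritten to request the given time. The original file is left untouched.
set_walltime() {
    sed "s/^#PBS -l walltime=.*/#PBS -l walltime=$2/" "$1"
}

# Example call (not run here; 20:00:00 is an illustrative value):
#   set_walltime jpt6_3.dat_seg_5_6 20:00:00 > jpt6_3.dat_seg_5_6.long
```

Writing the result to a new file keeps the generated script available for comparison.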



7) No further datasets can be submitted, and the running jobs will not be able to create new files or write new results to the existing files. Use "checkSeg" to free more space.



8) Use "checkSeg", "qstat -a" or "showq", click here. A 2000-SNP segment (excl. extended / contig region) takes ~2 - 5 hrs; however, some segments do exceed the 10 hrs computing time. Please click here for more details.
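When many segments are queued, it can help to count how many of the listed jobs are yours. The sketch below filters queue output by user name. The helper name count_user_jobs is hypothetical, and it assumes that the user name appears as a separate whitespace-delimited field on each job line, which may not hold for every PBS installation.

```shell
#!/bin/sh
# count_user_jobs USER: read a job listing on stdin and print the number of
# lines in which USER appears as a whole field.
count_user_jobs() {
    awk -v u="$1" '{
        for (i = 1; i <= NF; i++)
            if ($i == u) { n++; break }
    } END { print n + 0 }'
}

# Example call on the cluster (not run here):
#   qstat -a | count_user_jobs "$USER"
```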



9) Under rare circumstances, there might be job(s) or segment(s) shown as "Idle" among the completed segments on the list from "checkSeg". The "idle" segment(s) will not be found in the output of either "qstat -a" or "showq". This happens because the segment(s) were never submitted / executed by the PBS. So far such incidents have been reported 4 times out of the 4195 segments / jobs.

The LDU map cannot be assembled if any one of the segments is missing. There are two ways to resolve this problem. One is to delete all the existing results and re-run LDMAP_cluster with the same dataset. The other is to save / transfer the results of the completed segments ("retrieveResult") and manually re-submit the "idle" segment(s) to Iridis.

It is highly recommended to re-submit only the idle one(s) rather than re-computing all the segments. The simplest way is to look for the submission script associated with the "idle" segment (e.g. if the idle segment is segment 5 in dataset JPT6_3, then the submission script you need is "jpt6_3.dat_seg_5_6", located in the home directory).

Once you have located the submission script, re-submit the job manually via "qsub" from the home directory.

qsub jpt6_3.dat_seg_5_6

Although an additional segment (i.e. 6) is re-submitted along with the "idle" segment 5, this is better than amending the submission script as described below.

If you really want to submit the job with a single segment, please read on... Since LDMAP_cluster embeds two jobs (segments) in each submission, it is necessary to amend the script. Below is an example of a script that submits segment 5 as a single job.


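#!/bin/sh
# Request the usual 10 hours of computing time on one node.
#PBS -l walltime=10:00:00
#PBS -l nodes=1
# Job name for segment 5 only.
#PBS -N jpt6_3.dat_seg_5
# Run from the working directory of segment 5.
cd jpt6_3.dat_seg_5
ldmapper1+ jpt6_3.dat job int_5 ter_5 pro_5 7900 10100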
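For the simpler route of re-submitting the existing paired script, the script name can be derived from the dataset name and the idle segment number. The sketch below does this under one loud assumption: that segments are always paired as (1,2), (3,4), (5,6) and so on. Only the name "jpt6_3.dat_seg_5_6" is confirmed in the text above, and the function name submission_script is hypothetical.

```shell
#!/bin/sh
# submission_script DATASET SEGMENT: print the name of the paired submission
# script that contains SEGMENT.
# Assumption: segments are paired as (1,2), (3,4), (5,6), ...
submission_script() {
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    first=$(( $2 - ($2 + 1) % 2 ))   # odd segments open a pair; even ones step back
    echo "${name}.dat_seg_${first}_$((first + 1))"
}

submission_script JPT6_3 5   # prints jpt6_3.dat_seg_5_6

# Example re-submission from the home directory (not run here):
#   qsub "$(submission_script JPT6_3 5)"
```

It is worth confirming that the file actually exists in the home directory before passing it to qsub.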




10) The segment(s) / job(s) exceeded the 10 hrs computing time and were removed from the cluster. It is quite rare for a segment to take longer than 10 hrs to compute: this has been reported for 7 out of 4195 segments. Please contact me if you encounter such an instance. From the 19May06 version onward, the window size has been optimized to 75 SNPs, so no segment / job should exceed 10 hours of computing time.

The segments that have exceeded 10 hrs computing time are:

dataset chb6_2  segment 20;
dataset jpt6_2  segment 20;
dataset jpt6_3  segment 13;
dataset ceu8_2  segment 17 and 18;
dataset jpt8_2  segment 16;
dataset ceu11_1 segment 23
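To put the reported counts in perspective, they can be expressed as percentages of the 4195 segments run so far. The helper name percent below is hypothetical; the counts (7 timed-out segments and 4 "idle" segments) are the ones quoted in this FAQ.

```shell
#!/bin/sh
# percent PART TOTAL: print PART as a percentage of TOTAL, to two decimals.
percent() {
    awk -v p="$1" -v t="$2" 'BEGIN { printf "%.2f\n", 100 * p / t }'
}

percent 7 4195   # segments that exceeded 10 hrs: prints 0.17
percent 4 4195   # "idle" segments (item 9): prints 0.10
```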





11) This is most likely associated with a dropped SNP; please click here for further discussion. Currently this is fixed manually. Please contact me if you encounter such an instance.



12) Yes, all researchers and research students at the University of Southampton have access to LDMAP_cluster. Please click here for more details.



13) Currently, users outside the University of Southampton cannot access LDMAP_cluster on the Iridis cluster at the University. A web portal is planned to make LD analysis available as a service. Alternatively, you could run the program on your own Beowulf cluster.



14) Yes, please click here for more information.

 

 

15) No. The chromosome-wide LDU map is constructed using a segmental approach: the chromosome is split into segments of 2000 SNPs, and an epsilon is estimated for every segment (pro files).

Please note that although each segment is nominally 2000 SNPs, the actual number of SNPs in each segment is greater. The extra SNPs come from the extended regions at the split end(s) of the segment: 2100 SNPs in the first segment, 2200 SNPs in the middle ones and a variable number of SNPs (1000 - 2999) in the last segment; click here for further information.

So, the epsilon in each pro file represents a region of >2000 SNPs.

The number of SNPs in the last segment can be found in the file "<datasetfile>_contigInfo.txt" under the result directory (e.g. \result_ceu1_3.dat_1145206991\ceu1_3.dat_contigInfo.txt). Click here to view an example of the file.
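The 2100 and 2200 figures can be reproduced with a short calculation. The sketch below assumes an extension of 100 SNPs at each split end, which is inferred from those figures and from the values 7900 and 10100 in the single-segment script in item 9; it is not a documented parameter. The function name segment_bounds is hypothetical, and the last segment is not handled because its size varies.

```shell
#!/bin/sh
SEG_SIZE=2000   # SNPs per segment (from the FAQ)
FLANK=100       # assumed extension at each split end

# segment_bounds K: print "START END" for the K-th segment (not the last one).
segment_bounds() {
    start=$(( ($1 - 1) * SEG_SIZE - FLANK ))
    [ "$start" -lt 0 ] && start=0     # the first segment has no left extension
    end=$(( $1 * SEG_SIZE + FLANK ))
    echo "$start $end"
}

segment_bounds 1   # prints: 0 2100
segment_bounds 5   # prints: 7900 10100
```

Under this assumption the first segment spans 2100 SNPs and every middle segment spans 2200, matching the numbers given above.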



 

Under Construction......

email: wwsl@soton.ac.uk


Copyright © 2006

created 27Apr06
updated 16Oct06


































 

 
 
 
 