Python script to harvest data from STRUCTURE results folder
Web interface at http://taylor0.biology.ucla.edu/structureHarvester/ .
This script replicates the functionality found in the web version, without the fancy graphical interface or auto-magical plotting.
Latest version at https://github.com/dentearl/structureHarvester .
Description
structureHarvester.py is a Python script capable of extracting all the relevant data from STRUCTURE results files. You will need to have Python installed.
Once you cd into the directory containing the script using your favorite terminal app, make sure the script is executable ( chmod 755 structureHarvester.py ) and then execute it.
And then enjoy!
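To give a feel for the kind of extraction described above, here is a minimal sketch that pulls a few values out of the text of one results file. It is not the code in structureHarvester.py. The line labels in PATTERNS are assumptions about what a STRUCTURE results file looks like, and the names PATTERNS and harvest_text are made up for this illustration.

```python
import re

# ASSUMPTION: these labels are guesses at how the relevant lines appear in a
# STRUCTURE results file; adjust them to match your own output files.
NUMBER = r"(-?\d+(?:\.\d+)?)"
PATTERNS = {
    "k": re.compile(r"(\d+)\s+populations assumed"),
    "est_ln_prob": re.compile(r"Estimated Ln Prob of Data\s*=\s*" + NUMBER),
    "mean_lnl": re.compile(r"Mean value of ln likelihood\s*=\s*" + NUMBER),
    "var_lnl": re.compile(r"Variance of ln likelihood\s*=\s*" + NUMBER),
}


def harvest_text(text):
    """Return a dict of the values found in the text of one results file.

    Hypothetical helper: keys that are not found are simply left out.
    """
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            value = match.group(1)
            # K is a count, everything else is a floating point value.
            found[name] = int(value) if name == "k" else float(value)
    return found
```

Calling harvest_text(open(path).read()) on each file in a results folder and collecting the dicts would give a table similar in spirit to the harvested summary.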
Features
- WWW
Structure Harvester can be easily modified to run through a web interface -- in fact, this has already been done! Try it out at the web interface address given at the top of this page.
- HARVESTS
the run number, number of assumed populations (K), estimated Ln, mean likelihood and variance in likelihood, for EVERY file in the results folder! No more opening 120 files and copying data for hours! The script takes less than a second for 110 files.
- EVANNO
method! (Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study [Evanno et al. 2005 Molecular Ecology 14, 2611-2620])
- CLUMPP
.indfile generation!
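As an illustration of the EVANNO feature listed above, the sketch below computes a delta K value from replicate runs. It follows one reading of the statistic in Evanno et al. 2005: the absolute second-order difference of the mean Ln values across K, divided by the standard deviation of the Ln values at K. This is an assumption about the method, not a copy of the calculation in structureHarvester.py, and the function name delta_k and its input layout are hypothetical.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute delta K for each K that has usable neighbours.

    lnp_by_k maps K (int) to a list of estimated Ln values, one per
    replicate run at that K. Returns a dict {K: delta K}.

    ASSUMPTION: delta K = |L(K+1) - 2*L(K) + L(K-1)| / sd(L(K)), where L is
    the mean over replicates and sd is the sample standard deviation.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # both neighbours are needed for the second difference
        if len(lnp_by_k[k]) < 2:
            continue  # a standard deviation needs at least two replicates
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # avoid dividing by zero when replicates are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sd
    return result
```

Note that the smallest and largest K tested never receive a delta K value under this formulation, because each lacks one neighbour.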
Download
structureHarvester.zip version 0.6.93 (Oct 2012), now includes harvesterCore.py
which contains all of the shared functions and variables between the stand-alone and web versions.
structureHarvester.py version 0.6.91 (Feb 2012)
Version 0.6 (March 2011). Following correspondence with Dr. Goudet, a co-author of Evanno et al. 2005, I have made changes to the algorithm that calculates delta K.
struct_harvest.pl Version 0.3. Notice: incorrect delta K calculation in this version.
struct_harvest.pl Version 0.2.2. Notice: incorrect delta K calculation in this version.
Usage
[dearl @ demo]$ ./structureHarvester.py --help
Usage: structureHarvester.py --dir=path/to/dir/ --out=path/to/dir/ [options]

structureHarvester.py takes a STRUCTURE results directory ( --dir ) and an output directory ( --out will be created if it does not exist) and then depending on the other options selected harvests data from the results directory and performs the selected analyses

Options:
  --version         show program's version number and exit
  -h, --help        show this help message and exit
  --dir=RESULTSDIR  The structure Results/ directory.
  --out=OUTDIR      The out directory. If it does not exist, it will be created. Output written to summary.txt
  --evanno          If possible, performs the Evanno 2005 method. Written to evanno.txt. default=False
  --clumpp          Generates one K*.indfile for each value of K run, for use with CLUMPP. default=False
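If you would rather drive the script from another Python program, the options shown in the help text can be assembled into a command line. This is a minimal sketch: the helper names build_command and run_harvester are hypothetical, the paths in the usage note are placeholders, and it assumes the script sits in the current directory and has been made executable.

```python
import subprocess


def build_command(results_dir, out_dir, evanno=False, clumpp=False,
                  script="./structureHarvester.py"):
    """Build the argument list for one structureHarvester.py run.

    Only flags listed in the --help output are used. The default script
    path is an assumption about where the script lives.
    """
    cmd = [script, "--dir=" + results_dir, "--out=" + out_dir]
    if evanno:
        cmd.append("--evanno")
    if clumpp:
        cmd.append("--clumpp")
    return cmd


def run_harvester(results_dir, out_dir, **options):
    """Run the script and raise CalledProcessError if it exits non-zero."""
    return subprocess.run(build_command(results_dir, out_dir, **options),
                          check=True)
```

For example, run_harvester("Results/", "harvested/", evanno=True) would be expected to leave summary.txt and evanno.txt in the harvested/ directory, going by the help text.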
How to cite.
Earl, Dent A. and vonHoldt, Bridgett M. (2011) STRUCTURE HARVESTER: a website and program for visualizing STRUCTURE output and implementing the Evanno method. Conservation Genetics Resources DOI: 10.1007/s12686-011-9548-7 Version: v0.6.8 Oct 2011
References
Evanno et al. 2005. Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Molecular Ecology 14: 2611-2620.
M. Jakobsson, N. Rosenberg. 2007. CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure. Bioinformatics 23(14): 1801-1806. CLUMPP.
J. Pritchard, M. Stephens, P. Donnelly. 2000. Inference of population structure using multilocus genotype data. Genetics 155: 945-959. STRUCTURE.
License
Copyright (C) 2007-2011 by Dent Earl, dearl (a) soe ucsc edu

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.