The Data Harvesting Manager is a tool to manage and maintain the harvest files produced by CCP4 programs. It runs tasks to validate the format and consistency of harvest files in the same dataset and to convert harvest files from CIF to XML. It is also an interface to the PDB_EXTRACT package, which extracts additional information from harvest files, output log files and output MTZ files for deposition.
The "Convert CIF files to XML" program converts a selected harvest file into XML. It requires one input harvest file, selected from the list, and the name of an output XML file (see the CIF2XML program documentation).
The "Extract additional information for deposition" program is an interface to the PDB_EXTRACT program suite, which extracts additional relevant information from the output files of certain structure solution programs into a CIF file for use during deposition. To use it, choose "Run Program to Extract additional information for deposition" under Programs. Information can be extracted at three steps: heavy atom phasing, density modification and structure refinement. For detailed documentation, see PDB_EXTRACT.
A worked example is given below for each step: MAD phasing using the CCP4 programs MLPHARE and REVISE, density modification using the CCP4 program DM, and structure refinement using the CCP4 program REFMAC5. Each example also gives the equivalent pdb_extract command line.
Programs
Validating Harvest Files
This program checks that each highlighted file is written in correct mmCIF syntax. It also outputs only the common information that is found in all harvest files written by CCP4 programs. If more than one file is highlighted and the "Cross Validate Files" button is checked, the program also checks for differences in certain data between the files (see the cross_validate program documentation).
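The idea of cross validation can be illustrated with a minimal Python sketch. It is not the cross_validate program: the function names `parse_simple_cif` and `cross_validate` are hypothetical, the parser handles only single-line `_category.item value` pairs (it skips `loop_` blocks and multi-line values), and the file contents are made-up values.

```python
def parse_simple_cif(text):
    """Collect single-line `_category.item value` pairs from mmCIF text.

    Simplification (assumption): loop_ blocks and multi-line values are
    ignored, because their tag lines carry no value on the same line.
    """
    items = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("_"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # tag without a value on this line (e.g. inside a loop_)
        name, value = parts[0], parts[1].strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        items[name] = value
    return items


def cross_validate(first, second):
    """Return {item: (value_in_first, value_in_second)} for shared items that differ."""
    shared = first.keys() & second.keys()
    return {
        name: (first[name], second[name])
        for name in sorted(shared)
        if first[name] != second[name]
    }


# Hypothetical harvest file contents, for illustration only.
first_text = """data_example
_entry.id                        demo
_cell.length_a                   78.10
_symmetry.space_group_name_H-M   'P 21 21 21'
"""
second_text = """data_example
_entry.id                        demo
_cell.length_a                   78.35
_symmetry.space_group_name_H-M   'P 21 21 21'
"""

diffs = cross_validate(parse_simple_cif(first_text), parse_simple_cif(second_text))
print(diffs)  # {'_cell.length_a': ('78.10', '78.35')}
```

Only items present in both files are compared, which mirrors the focus on information common to the harvest files.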
Convert CIF files to XML
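As a rough illustration of what a CIF to XML conversion involves, the following Python sketch maps simple mmCIF items onto XML elements. It does not reproduce the CIF2XML program or its output schema: the element names `harvest`, `category` and `item`, the function name `cif_items_to_xml` and the sample values are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def cif_items_to_xml(items, root_tag="harvest"):
    """Group `_category.item` names by category and emit one XML string.

    The element layout is hypothetical, chosen only for illustration.
    """
    root = ET.Element(root_tag)
    categories = {}
    for name, value in items.items():
        category, _, item = name.lstrip("_").partition(".")
        parent = categories.get(category)
        if parent is None:
            parent = ET.SubElement(root, "category", name=category)
            categories[category] = parent
        child = ET.SubElement(parent, "item", name=item)
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Made-up values standing in for items read from a harvest file.
items = {
    "_cell.length_a": "78.10",
    "_cell.length_b": "90.00",
    "_symmetry.space_group_name_H-M": "P 21 21 21",
}
xml_text = cif_items_to_xml(items)
print(xml_text)
```

The mmCIF names are stored in attributes rather than used as element names, because some mmCIF item names contain characters such as square brackets that are not valid in XML element names.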
Extract additional information for deposition
1. Heavy atom phasing -> Requires output from one of CNS, Mlphare, Solve, Sharp, SnB or ShelxD/ShelxS
2. Density Modification -> Requires output from one of CNS, DM, Solomon, Resolve, Sharp or ShelxE
3. Structure Refinement -> Requires output from one of CNS, Refmac5, ShelxL, TNT or ARP/wARP
For each step, the program that generated the output files must be selected from the menu, and the required files must be specified. The resulting file is written in CIF format and organised so that it is ready for deposition.
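The step and program combinations listed above can be written down as a small lookup table. The Python sketch below is only an aid for checking a choice before running a task. The step keys, the upper-case program spellings and the function names are assumptions made for this example, not identifiers taken from PDB_EXTRACT.

```python
# Programs accepted at each step, as listed in this document.
# Upper-case spellings are a normalisation chosen for this sketch.
STEPS = {
    "heavy_atom_phasing": {"CNS", "MLPHARE", "SOLVE", "SHARP", "SNB", "SHELXD", "SHELXS"},
    "density_modification": {"CNS", "DM", "SOLOMON", "RESOLVE", "SHARP", "SHELXE"},
    "structure_refinement": {"CNS", "REFMAC5", "SHELXL", "TNT", "ARP/WARP"},
}


def steps_for_program(program):
    """Return the steps at which output from `program` can be used."""
    key = program.upper()
    return [step for step, programs in STEPS.items() if key in programs]


def check_choice(step, program):
    """Return the normalised program name, or raise if the pairing is not listed."""
    if step not in STEPS:
        raise KeyError(f"unknown step: {step}")
    key = program.upper()
    if key not in STEPS[step]:
        raise ValueError(f"{program} is not listed for step {step}")
    return key


print(steps_for_program("Sharp"))    # ['heavy_atom_phasing', 'density_modification']
print(steps_for_program("Refmac5"))  # ['structure_refinement']
```

CNS is the only program listed for all three steps, so `steps_for_program("CNS")` returns every step.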
1. Heavy Atom Phasing
In this example (MAD phasing using MLPHARE and REVISE), the task ideally requires the harvest file from MLPHARE and the log file from REVISE. It extracts phasing and wavelength information.
Select "Extract information from Heavy Atom Phasing step". A new folder will appear. Select the method type and program, e.g. "Using MAD and MLPHARE". Then declare the MLPHARE harvest file as the CIF file and the REVISE log file as the LOG file. It is not necessary to declare a PDB file for this example, since MLPHARE does not produce a final PDB file at this stage. Finally, choose a name for the output CIF file and run the task. The same extraction can be run on the command line with the following command:
pdb_extract -p MLPHARE -iCIF [MLPHARE HARVEST FILE] -iLOG [REVISE LOG FILE] -o [OUTPUT CIF FILE]
2. Density Modification
In this example, the task requires only the log file from the DM program and creates a CIF file containing some phasing statistics.
Select "Extract information from Density Modification step" and choose the DM program. Declare the DM log file as the LOG file, declare the name of the output CIF file and run the task. The same extraction can be run on the command line with the following command:
pdb_extract -d DM -iLOG [DM LOG FILE] -o [OUTPUT CIF FILE]
3. Structure Refinement
In this example, the task ideally requires the REFMAC5 harvest file and the output PDB file. It writes a CIF file that combines all relevant information from the harvest file and the PDB file, including refinement and model statistics and the model coordinates.
Select "Extract information from Structure Refinement step". Then select the method type and program, e.g. "Using MAD and REFMAC5". Declare the name of the refined PDB file and the REFMAC5 harvest file. Finally, choose a name for the output CIF file and run the task. The same extraction can be run on the command line with the following command:
pdb_extract -r REFMAC5 -iCIF [REFMAC5 HARVEST FILE] -iPDB [REFMAC5 PDB FILE] -o [OUTPUT CIF FILE]
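The three command lines above follow one pattern, which the Python sketch below assembles from its parts. It uses only the options that appear in this document (-p, -d, -r, -iCIF, -iLOG, -iPDB and -o). The helper name `build_command`, the step keys and every file name are hypothetical, and the sketch prints the commands rather than running them.

```python
import shlex

# Step selector flags as used in the example commands in this document.
STEP_FLAGS = {
    "heavy_atom_phasing": "-p",
    "density_modification": "-d",
    "structure_refinement": "-r",
}


def build_command(step, program, output_cif, cif=None, log=None, pdb=None):
    """Return the pdb_extract argument list for one extraction step."""
    argv = ["pdb_extract", STEP_FLAGS[step], program]
    for flag, path in (("-iCIF", cif), ("-iLOG", log), ("-iPDB", pdb)):
        if path is not None:
            argv += [flag, path]
    argv += ["-o", output_cif]
    return argv


# Hypothetical file names, for illustration only.
commands = [
    build_command("heavy_atom_phasing", "MLPHARE", "phasing.cif",
                  cif="mlphare.harvest", log="revise.log"),
    build_command("density_modification", "DM", "dm.cif", log="dm.log"),
    build_command("structure_refinement", "REFMAC5", "refine.cif",
                  cif="refmac5.harvest", pdb="refmac5.pdb"),
]
for argv in commands:
    print(shlex.join(argv))
```

If pdb_extract is installed and on the PATH, each list could be passed to `subprocess.run(argv, check=True)`. Keeping the arguments as a list avoids shell quoting problems with file names that contain spaces.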
Output
The output of these programs can be checked at a glance in the window in the "Output" folder at the bottom of the task window. It shows whether each program completed successfully and highlights any potential problems encountered during the run.