spats_tool¶
The spats_tool
command can be used to perform most of the standard
SPATS tasks. The first thing to do to use the spats_tool
is to
designate a directory for your experiment. In general, this should
probably be a new directory for each new data set; for organizational
purposes, it’s best not to have too many other files in the
folder. The directory needs to be on the same filesystem as your data
files.
As an example, we’ll create a folder for a new “testing” experiment:
$ mkdir testing_2017_08_10
$ cd testing_2017_08_10
$ spats_tool init
The tool works using a spats.config
file, which is a text file
that provides the information required for the various tool
functions. spats_tool init
will create a default config file,
be sure to edit it after it’s created. The rest of the information is
about the locations of the target and input files. Here’s a sample:
[spats]
preseq = my_preseq_file.fsa # optional
target = my_target.fa
r1 = /path/to/data/experiment_data_R1.fastq.gz
r2 = /path/to/data/experiment_data_R2.fastq.gz
cotrans = False
[metadata]
...
The preseq
parameter is optional and only required for the pre
tool.
The target
, r1
, and r2
parameters are
required for most tools. If this is a cotrans experiment,
set cotrans = True
; then provide the paths to the
target/R1/R2 files. Note that you can use .gz
files for r1/r2, in
which case they will be decompressed on the fly, and cleaned up after
the run completes.
If you wish to set any configuration for the SPATS run or reads
analysis (see below), you can set parameters according to the
run.Run
documentation; for example,
minimum_target_match_length = 12
.
The spats_tool
command must be run in the experiment directory you
created with the spats.config
file.
spats_tool pre¶
The first command you may wish to run is pre
, which takes the
information from a *.fsa
(ABIF) file and creates a plot.
$ spats_tool pre
:pre-sequencing data processed to pre.spats
:pre complete @ 0.02s
This extracts the data and stores it a format from which it can be easily inspected and plotted.
spats_tool reads¶
The reads
command analyzes the experimental data and creates a
` reads.spats` file, which can be used with the visualization tool to
analyze the quality of the data.
$ spats_tool reads
:** removing previous reads.spats
:using native reads
Lookup table: 1076 R1 entries, 121 R2 entries.
Lookup table: 1076 R1 entries, 121 R2 entries.
Processing pairs...
Created 8 workers
^^^^^^^^^.v........vvvvvvvvxxxxxxxx
Aggregating data...
Successfully processed 3640 properly paired fragments:
...
:tags processed to reads.spats
:reads complete @ 59.38s
spats_tool run¶
The basic command is run
, which performs the SPATS run to compute
site reactivities:
$ spats_tool run
:using native cotrans processor
:decompressing /projects/b1044/.../EJS_6_F_10mM_NaF_Rep1_GCCAAT_R1.fastq.gz
:decompress R1 @ 41.51s
:decompressing /projects/b1044/.../EJS_6_F_10mM_NaF_Rep1_GCCAAT_R2.fastq.gz
:decompress R1 @ 93.62s
:wrote output to run.spats
:run complete @ 134.99s
All spats_tool
work in the experiment directory and update the
spats.log
file there; for example, in this case, it looks like:
2017/08/10 13:22 : run, 134.99s
- ** removing previous run.spats
- using native cotrans processor
- decompressing /projects/b1044/.../EJS_6_F_10mM_NaF_Rep1_GCCAAT_R1.fastq.gz
- decompress R1 @ 47.76s
- decompressing /projects/b1044/.../EJS_6_F_10mM_NaF_Rep1_GCCAAT_R2.fastq.gz
- decompress R1 @ 93.00s
- wrote output to run.spats
- run complete @ 134.99s
As the output and log indicate, the results of the run are written to
the run.spats
file, which is a sqlite-DB file that can be used by
spats_tool
to dump results and create plots. All tools append to this
log file, so you have a record of all analyses performed, including
date/time stamps.
spats_tool dump¶
The dump
command is used to access the raw data and dump it to
CSV. Requires a dump type – options:
spats_tool dump reads
: dumps the tags data for the reads analysis toreads.csv
spats_tool dump run
: dumps the treated/untreated count, beta, theta, and rho values from the run analysis to CSV files named for the corresponding targets.