The Wold Lab

Caltech Biology|Bioinformatics Lab

BHUtils

BHUtils is a Python package of utilities that were developed for the BioHub project but can be used independently. They include a disk-based multi-record FASTA reader (capable of pulling subsequences out of entire chromosomes), download utilities, file decompression utilities, reverse complementing, batch BLAST searches (currently requires Biopython), handling of BLAST results, batch BLAT searches, BLAST database creation, and BLAT nib creation.

About Package

bhutils.BlastDB:
 * Allows for easy blasting (a wrapper around Biopython's BLAST
   utilities).
 * Contains code for generating a BLAST database (i.e. running the
   formatdb command line) from a list of FASTA files. (See
   bhutils.Fasta for useful FASTA handling code.)
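
As a rough illustration of the database-generation step, the sketch below merges several FASTA files into one and builds the argument list for a nucleotide formatdb run. The function names concatenate_fasta and formatdb_command are hypothetical and are not part of bhutils.BlastDB. The sketch only constructs the command, it does not execute it, and it assumes formatdb's -i (input), -p (T for protein, F for nucleotide) and -n (database name) options.

```python
def concatenate_fasta(paths, combined_path):
    """Write all FASTA files in `paths` into one file (hypothetical helper)."""
    with open(combined_path, "wb") as out:
        for path in paths:
            with open(path, "rb") as src:
                data = src.read()  # whole file in memory; fine for a sketch
            out.write(data)
            # Make sure the next record starts on its own line.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")
    return combined_path


def formatdb_command(fasta_path, db_name, protein=False):
    """Return the argument list for a formatdb run (not executed here)."""
    return [
        "formatdb",
        "-i", str(fasta_path),          # input FASTA file
        "-p", "T" if protein else "F",  # T = protein, F = nucleotide
        "-n", db_name,                  # base name of the database files
    ]
```

The returned list could be handed to subprocess.run() on a machine where formatdb is installed.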

bhutils.BlastHandler:

 * Filter classes that take the output of the BlastDB module and
   filter the results, for example returning only results that have
   n mismatches.
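
The filtering idea can be sketched as a small callable class. Everything here is hypothetical: the Hit record, its fields, and the MaxMismatchFilter name are illustrative stand-ins, not the classes in bhutils.BlastHandler, and the sketch assumes "n mismatches" means "at most n".

```python
from collections import namedtuple

# Hypothetical stand-in for one parsed BLAST result.
Hit = namedtuple("Hit", "query subject mismatches")


class MaxMismatchFilter:
    """Keep only hits with at most `max_mismatches` mismatches."""

    def __init__(self, max_mismatches):
        self.max_mismatches = max_mismatches

    def __call__(self, hits):
        # Preserve the input order; drop hits over the threshold.
        return [hit for hit in hits if hit.mismatches <= self.max_mismatches]


hits = [
    Hit("probe1", "chr1", 0),
    Hit("probe1", "chr7", 3),
    Hit("probe2", "chr2", 1),
]
kept = MaxMismatchFilter(1)(hits)
```

Because each filter is a callable that takes and returns a list of hits, several filters can be applied one after another.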

bhutils.Blat:

 * Utility for running BLAT.
 * Utility for creating nibs (BLAT database files) from a list of
   FASTA files.
 * Parser for BLAT output.

bhutils.Decompress:
 * Allows decompression of files from within Python (zip, bzip2, tar,
   gzip). A single function handles all compression types, so there
   is no need to import the individual decompression modules. (The
   actual decompression is done by modules included with Python.)
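
A single-entry-point decompressor along these lines can be sketched with the standard library alone. The function name decompress and the choice to dispatch on the file extension are assumptions made for illustration; bhutils.Decompress may be organised differently.

```python
import bz2
import gzip
import shutil
import tarfile
import zipfile
from pathlib import Path


def decompress(path, dest_dir):
    """Decompress `path` into `dest_dir`, picking the method from the file name."""
    path = Path(path)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    name = path.name.lower()

    if name.endswith(".zip"):
        with zipfile.ZipFile(path) as archive:
            archive.extractall(dest_dir)
    elif name.endswith((".tar", ".tar.gz", ".tgz", ".tar.bz2")):
        # "r:*" lets tarfile detect gzip/bzip2 compression itself.
        # Only extract archives from sources you trust.
        with tarfile.open(path, "r:*") as archive:
            archive.extractall(dest_dir)
    elif name.endswith((".gz", ".bz2")):
        # Single compressed file: strip the last suffix for the output name.
        opener = gzip.open if name.endswith(".gz") else bz2.open
        with opener(path, "rb") as src, open(dest_dir / path.stem, "wb") as dst:
            shutil.copyfileobj(src, dst)
    else:
        raise ValueError("unsupported file type: %s" % path.name)
    return dest_dir
```

The tar check comes before the plain gzip/bzip2 check so that a name such as genome.tar.gz is unpacked as an archive rather than as a single file.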

bhutils.DirectoryBuilder:
 * Given a path, builds a directory tree (more like a branch).
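
In current Python the same effect is available from the standard library. The sketch below shows one way to do it and is not the code in bhutils.DirectoryBuilder; the name build_branch is hypothetical.

```python
import os


def build_branch(path):
    """Create `path` and any missing parent directories; return the path."""
    # exist_ok=True makes the call safe to repeat on an existing branch.
    os.makedirs(path, exist_ok=True)
    return path
```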

bhutils.DownloadUtil:

 * A utility to download one or many files over HTTP or FTP. Can use
   wildcards. On Linux systems where wget exists, it uses wget;
   otherwise it falls back to modules included with Python.
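
The wget-with-fallback behaviour might look roughly like the sketch below. The function name download and the prefer_wget parameter are hypothetical, and wildcard expansion is not covered here. The wget options used are -q (quiet) and -O (output file).

```python
import shutil
import subprocess
import urllib.request


def download(url, dest, prefer_wget=True):
    """Fetch `url` to the local path `dest`; return the backend that was used."""
    wget = shutil.which("wget") if prefer_wget else None
    if wget:
        # check=True raises CalledProcessError if wget reports a failure.
        subprocess.run([wget, "-q", "-O", str(dest), url], check=True)
        return "wget"
    # Fallback: standard library only.
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return "urllib"
```

Returning the backend name makes it easy to log which path was taken on a given machine.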

bhutils.Fasta:
 * Disk-based FASTA reader.
 * Only retrieves sequence data when the getSequence*() functions are
   called. (Seeks to the correct location in the file based on a
   precomputed index of the FASTA file.)
 * Can write list of FastaSequence objects to FASTA file with multiple
   FASTA sequences in it.
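
The index-then-seek approach described above can be sketched as follows. This is an illustration under stated assumptions, not the bhutils.Fasta implementation: the names build_index and get_subsequence are hypothetical, coordinates are 0-based and half-open, and every sequence line within a record (except the last) is assumed to have the same length, with no blank lines inside a record.

```python
def build_index(path):
    """Map record name -> [sequence_offset, bases_per_line, bytes_per_line, length]."""
    index = {}
    name = None
    with open(path, "rb") as handle:
        while True:
            line = handle.readline()
            if not line:
                break
            if line.startswith(b">"):
                # Record name is the first word of the header line.
                name = line[1:].split()[0].decode()
                index[name] = [handle.tell(), 0, 0, 0]
            elif name is not None:
                entry = index[name]
                bases = len(line.rstrip(b"\r\n"))
                if entry[1] == 0:
                    # First sequence line defines the line geometry.
                    entry[1], entry[2] = bases, len(line)
                entry[3] += bases
    return index


def get_subsequence(path, index, name, start, end):
    """Return bases [start, end) of record `name` without reading the whole file."""
    offset, bases_per_line, bytes_per_line, length = index[name]
    end = min(end, length)
    if start >= end:
        return ""

    def byte_position(base):
        # Whole lines before the base, plus the position within its line.
        return offset + (base // bases_per_line) * bytes_per_line + base % bases_per_line

    first, last = byte_position(start), byte_position(end - 1)
    with open(path, "rb") as handle:
        handle.seek(first)
        chunk = handle.read(last - first + 1)
    # Drop the newline bytes that fall inside the requested range.
    return chunk.replace(b"\n", b"").replace(b"\r", b"").decode()
```

Only the index pass reads the file from start to end. Each later request reads just the bytes that cover the requested range, which is what makes pulling a small piece out of a whole chromosome cheap.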

bhutils.SequenceUtils:
 * ReverseComplement code
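
For reference, reverse complementing a DNA string takes only a few lines of Python. This is a generic sketch, not the code in bhutils.SequenceUtils; the name reverse_complement is hypothetical, and only A, C, G, T and N (upper or lower case) are complemented, with any other character passed through unchanged.

```python
# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence string."""
    return sequence.translate(_COMPLEMENT)[::-1]
```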

License

LGPL

Download

bioutils v0.1.4 [ zip | tar.gz ] - 2008Jan17

Older releases