Parse PIR


Introduction

Parse PIR-formatted data (protein sequences, aligned).

If more than one sequence is found in the file and all sequences have the same number of characters, an alignment object is created as well.
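The alignment condition can be expressed as a small check. This is only a sketch of the rule as stated here, not the parser's actual code: the function name can_form_alignment is hypothetical, and it assumes sequence lengths are compared after whitespace has been removed.

```python
def can_form_alignment(sequences):
    """Return True if the sequences qualify as an alignment.

    Hypothetical helper: requires more than one sequence, and all
    sequences must have the same number of characters.
    """
    if len(sequences) <= 1:
        return False
    # A set of lengths with exactly one member means all lengths agree.
    return len({len(seq) for seq in sequences}) == 1
```

A single sequence, or a set of sequences with differing lengths, yields sequences only and no alignment.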

File format

The ASCII-formatted file is expected to contain one or more sequence entries. Each sequence entry must have:

  1. A line beginning with >P1; followed by the name of the sequence
  2. A line beginning with either structure or sequence
  3. A third line and any subsequent lines, which are assumed to contain the sequence in capital letters, possibly with whitespace. An asterisk (*) marks the end of the sequence and thus the end of the entry.
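The entry layout above can be illustrated with a minimal reader. This is a sketch under assumptions, not the parser's real implementation: parse_pir and SAMPLE are hypothetical names, the sample data is invented for illustration, and anything after the asterisk on the same line is simply ignored.

```python
# Invented sample data with two entries, for illustration only.
SAMPLE = """
>P1;seqA
sequence
MKV LAT
GHW*
>P1;seqB
structure
MKVIATGHW*
"""


def parse_pir(text):
    """Return a list of (name, kind, sequence) tuples from PIR text."""
    entries = []
    lines = iter(text.splitlines())
    for line in lines:
        if not line.startswith(">P1;"):
            continue  # ignore lines outside an entry
        name = line[len(">P1;"):].strip()

        # Second line must begin with "structure" or "sequence".
        kind_line = next(lines, "")
        if kind_line.startswith("structure"):
            kind = "structure"
        elif kind_line.startswith("sequence"):
            kind = "sequence"
        else:
            raise ValueError(f"entry {name!r}: bad second line {kind_line!r}")

        # Third and later lines: sequence text up to the asterisk.
        chunks = []
        for seq_line in lines:
            head, star, _ = seq_line.partition("*")
            chunks.append("".join(head.split()))  # drop whitespace
            if star:
                break  # asterisk ends the entry
        else:
            raise ValueError(f"entry {name!r}: missing terminating '*'")

        entries.append((name, kind, "".join(chunks)))
    return entries


if __name__ == "__main__":
    for name, kind, seq in parse_pir(SAMPLE):
        print(name, kind, seq)
```

The inner loop shares one iterator with the outer loop, so reading resumes at the line after each asterisk and the next >P1; header starts a new entry.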

Configuration file entries

There are currently no configurable options.

