Releases: biotite-dev/biotite

Biotite 0.23.0

03 Aug 15:30

Changelog

Additions

  • Improved example gallery
    • Added minigalleries in the API reference to get tangible examples for the
      respective function/class
    • Added support for animated Matplotlib plots
    • Using Ammolite for rendering PyMOL images
  • Added support for new RCSB search API
    • New database.rcsb.Query classes that reflect the entirety of the new
      search API, including sequence, sequence motif and structure searches
      • Multiple database.rcsb.Query objects can be combined/negated using the
        operators |, & and ~
    • Added the return_type, sort_by and range parameters to
      database.rcsb.search()
    • Added database.rcsb.count() function to count the number of results a
      database.rcsb.Query would yield in a less costly way than
      database.rcsb.search()
  • Increased indexing speed in biotite.structure.BondList
  • Added the sequence.Sequence.alphabet property, which is equivalent to
    sequence.Sequence.get_alphabet()
  • Added convenience functions fastq.get_sequence(), fastq.get_sequences(),
    fastq.set_sequence() and fastq.set_sequences()
  • Drastically increased writing speed of sequence.io.fasta.FastaFile
  • Increased mapping speed of sequence.AlphabetMapper
  • Added sequence.Alphabet.is_letter_alphabet() method
  • Added general sequence I/O convenience functions
    sequence.io.load_sequence(), sequence.io.load_sequences(),
    sequence.io.save_sequence() and sequence.io.save_sequences() that derive
    the appropriate File class from the suffix of the file name.
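
The composition of queries with |, & and ~ described above can be pictured with a small operator-overloading sketch. The classes Term, Group and Not below are hypothetical stand-ins, not Biotite classes, and the queries are evaluated against local dictionaries purely for illustration; the real database.rcsb.Query objects are sent to the RCSB search API, which this sketch does not attempt to reproduce.

```python
from dataclasses import dataclass


class Query:
    """Hypothetical base class that supports |, & and ~ composition."""

    def __or__(self, other):
        return Group("or", (self, other))

    def __and__(self, other):
        return Group("and", (self, other))

    def __invert__(self):
        return Not(self)


@dataclass(frozen=True)
class Term(Query):
    """Matches entries whose field has exactly the given value."""
    field: str
    value: str

    def matches(self, entry):
        return entry.get(self.field) == self.value


@dataclass(frozen=True)
class Group(Query):
    """Combines child queries with 'and' or 'or'."""
    operator: str
    children: tuple

    def matches(self, entry):
        results = (child.matches(entry) for child in self.children)
        return any(results) if self.operator == "or" else all(results)


@dataclass(frozen=True)
class Not(Query):
    """Negates a child query."""
    child: Query

    def matches(self, entry):
        return not self.child.matches(entry)


# Made-up entries, used only to demonstrate the composition
entries = [
    {"method": "X-ray", "organism": "human"},
    {"method": "NMR", "organism": "human"},
    {"method": "X-ray", "organism": "mouse"},
]
query = Term("organism", "human") & ~Term("method", "NMR")
hits = [entry for entry in entries if query.matches(entry)]
```

Because ~ binds more tightly than &, the expression reads as "human and not NMR" without extra parentheses.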

Changes

  • The omit_chain parameter has been removed from database.rcsb.search()
  • The old database.rcsb.Query classes have been removed
  • Removed python setup.py test and python setup.py build_sphinx commands,
    please use pytest and sphinx-build directly instead
  • Renamed sequence.NucleotideSequence.alphabet to
    sequence.NucleotideSequence.alphabet_unamb
  • sequence.io.fastq.FastqFile returns its entries only as str instead of
    sequence.NucleotideSequence for consistency with
    sequence.io.fasta.FastaFile
    • The method sequence.io.fastq.FastqFile.get_sequence() is deprecated
    • The method sequence.io.fastq.FastqFile.get_seq_string() returns the
      sequence as a str instead of a sequence.NucleotideSequence

Fixes

  • Fixed expect_looped parameter in
    structure.io.pdbx.PDBxFile.get_category()
  • Fixed error in structure.io.pdbx.PDBxFile that was raised if a PDBx
    field and its single-line value are on separate lines
  • Added check for boolean mask length, when a boolean mask is given as index
    to biotite.structure.BondList
  • Changed chain_id dtype from 'U3' to 'U4' (#215)

Biotite 0.22.0

04 Jun 13:26
8979702

Changelog

Additions

  • Added structure.filter_nucleotides()
  • structure.io.pdbx.get_sequence() is able to parse a
    sequence.NucleotideSequence from a PDBx file in addition to
    sequence.ProteinSequence
  • Added structure.base_pairs() for determining base pairs in nucleic acid
    structures
  • Added structure.get_residue_starts_for()
  • Added structure.check_atom_id_continuity()
  • Added structure.renumber_atom_ids() and structure.renumber_res_ids()
    to fix structures with discontinuous atom/residue IDs
  • Added get_model_count() to structure.io.pdb, structure.io.pdbx,
    structure.io.mmtf and structure.io.gro to obtain the total number
    of models
  • The model parameter in get_structure() in structure.io.pdb,
    structure.io.pdbx, structure.io.mmtf and structure.io.gro supports
    negative values to start indexing beginning from the last model
  • Increased performance of residue and chain-related functions
    (e.g. structure.get_residue_starts())

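
The ID continuity check and renumbering listed above can be illustrated on a plain list of atom IDs. This is a minimal sketch, not Biotite's implementation: the function names are hypothetical, and reporting the index of the first atom after each gap is an assumption made here.

```python
def find_discontinuities(atom_ids):
    """Return each index i where atom_ids[i] is not atom_ids[i - 1] + 1."""
    return [
        i for i in range(1, len(atom_ids))
        if atom_ids[i] != atom_ids[i - 1] + 1
    ]


def renumber(atom_ids, start=1):
    """Return continuous IDs of the same length, beginning at start."""
    return list(range(start, start + len(atom_ids)))


atom_ids = [1, 2, 3, 7, 8]
gaps = find_discontinuities(atom_ids)
fixed = renumber(atom_ids)
```

After renumbering, the same check reports no discontinuities.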
Changes

  • Revamped altloc ID handling (#194)
    • Instead of choosing each alternate location individually, there are three
      options:
      • 'first' always chooses the atoms with the first altloc ID
        for each residue
      • 'occupancy' always chooses the atoms with the highest occupancy
        for each residue
      • 'all' does not filter any altloc IDs and adds the altloc_id
        annotation to the resulting structure.AtomArray or
        structure.AtomArrayStack
  • Renamed structure.check_id_continuity() to
    structure.check_res_id_continuity(); structure.check_id_continuity()
    is still available, but is deprecated
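
The three altloc options can be sketched on a list of atom records. This is an illustration under stated assumptions, not the library's code: atoms are plain dictionaries with hypothetical keys, and for 'occupancy' the altloc ID with the highest summed occupancy within a residue is assumed to win.

```python
def filter_altloc(atoms, altloc="first"):
    """Select one alternate location per residue, or keep all of them."""
    if altloc == "all":
        return list(atoms)
    if altloc not in ("first", "occupancy"):
        raise ValueError(f"unknown altloc option: {altloc!r}")

    # Per residue: altloc IDs in order of appearance with summed occupancy
    totals = {}
    for atom in atoms:
        if atom["altloc"]:
            per_residue = totals.setdefault(atom["res_id"], {})
            per_residue[atom["altloc"]] = (
                per_residue.get(atom["altloc"], 0.0) + atom["occupancy"]
            )

    chosen = {}
    for res_id, per_residue in totals.items():
        if altloc == "first":
            # Dictionaries preserve insertion order
            chosen[res_id] = next(iter(per_residue))
        else:
            chosen[res_id] = max(per_residue, key=per_residue.get)

    # Atoms without an altloc ID are always kept
    return [
        atom for atom in atoms
        if not atom["altloc"] or atom["altloc"] == chosen[atom["res_id"]]
    ]


atoms = [
    {"res_id": 1, "name": "N", "altloc": "", "occupancy": 1.0},
    {"res_id": 1, "name": "CA", "altloc": "A", "occupancy": 0.3},
    {"res_id": 1, "name": "CA", "altloc": "B", "occupancy": 0.7},
    {"res_id": 2, "name": "CA", "altloc": "", "occupancy": 1.0},
]
first = [a["altloc"] for a in filter_altloc(atoms, "first")]
highest = [a["altloc"] for a in filter_altloc(atoms, "occupancy")]
```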

Fixes

  • Fixed structure.BondList being iterable, yielding nonsense values
  • Improved element guesses in structure.io.pdb.PDBFile when the
    element column is missing (#188)
  • Fixed parsing of single models from structure.io.mmtf.MMTFFile (#205)
  • Fixed missing unit cell values in structure.io.pdbx.get_structure()
    raising an error; the box attribute is set to None instead

Biotite 0.21.0

28 Apr 13:48
1c806ce

Changelog

Additions

  • More functionality for structure.BondList
    • __contains__() method to test whether a bond exists
    • find_connected() identifies systems of connected atoms (aka molecules)
  • Added frame-wise iteration of trajectory files to save memory
    • structure.io.TrajectoryFile.read_iter() yields coordinates, box and time for each frame
    • structure.io.TrajectoryFile.read_iter_structure() yields a structure.AtomArray for each frame
  • Added ability to read entire biological assemblies from mmCIF files
    • structure.io.pdbx.list_assemblies() lists the available assemblies
    • structure.io.pdbx.get_assembly() returns the given assembly as
      structure.AtomArray or structure.AtomArrayStack
  • Added the expect_looped parameter to
    structure.io.pdbx.PDBxFile.get_category()
  • structure.info.vdw_radius_single() also provides VdW radii for less
    common elements
  • Added structure.get_residue_masks(), which masks all residues to which the
    given atoms belong
  • Added structure.repeat() function to repeat atoms multiple times in the
    same model with different coordinates
  • Added a bunch of new examples to the gallery
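
Finding systems of connected atoms, as find_connected() is described above, amounts to a graph traversal over the bonds. The following is a minimal sketch of that idea using breadth-first search on a list of index pairs; it is not Biotite's implementation, and the function signatures are hypothetical.

```python
from collections import deque


def find_connected(bonds, root):
    """Return the sorted indices of all atoms reachable from root."""
    neighbours = {}
    for i, j in bonds:
        neighbours.setdefault(i, set()).add(j)
        neighbours.setdefault(j, set()).add(i)

    seen = {root}
    queue = deque([root])
    while queue:
        atom = queue.popleft()
        for other in neighbours.get(atom, ()):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return sorted(seen)


def has_bond(bonds, i, j):
    """Test whether a bond exists, regardless of atom order."""
    return (i, j) in bonds or (j, i) in bonds


# Two separate molecules: atoms 0-1-2 and atoms 3-4
bonds = [(0, 1), (1, 2), (3, 4)]
molecule = find_connected(bonds, 0)
```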

Changes

  • temp_file() and temp_dir() are deprecated, use the Python standard library
    module tempfile instead
  • For all File classes, read() is now a class method,
    e.g. pdbx_file = PDBxFile.read(), the old instance method is deprecated
  • database.rcsb.fetch() and database.entrez.fetch() overwrite an existing
    file if it is empty
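
For code that relied on temp_file() and temp_dir(), the standard library equivalents look roughly as follows. The file names and contents are made up for illustration.

```python
import os
import tempfile

# Temporary file: removed automatically when the block exits
with tempfile.NamedTemporaryFile("w+", suffix=".fasta") as temp:
    temp.write(">seq1\nACGT\n")
    temp.flush()
    temp.seek(0)
    content = temp.read()

# Temporary directory: removed together with its contents
with tempfile.TemporaryDirectory() as temp_dir:
    path = os.path.join(temp_dir, "structure.pdb")
    with open(path, "w") as file:
        file.write("END\n")
    existed = os.path.exists(path)
```

Unlike a fixed temporary directory, both context managers clean up after themselves, so no stale files accumulate between runs.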

Fixes

  • A newline character is appended to the end of file, when writing text files
  • Fixed structure.CellList when using the periodic parameter in combination
    with the selection parameter; previously, unallocated memory could be
    accessed

Biotite 0.20.1

28 Feb 14:06
04c3c97

Changelog

Fixes

  • Fixed support for msgpack 1.0

Biotite 0.20.0

27 Feb 10:54
a8f7e16

Changelog

Additions

  • Added structure.from_template() to create a structure.AtomArrayStack from an existing atom array (or stack) and coordinates
  • Added ignore parameter to sequence.io.genbank.get_annotation() to ignore the given feature keys
  • Added sequence.graphics.plot_plasmid_map() for visualizing sequence.Annotation objects as a plasmid map
  • Added a bunch of new examples to the gallery
  • Added support for Python 3.8 on Windows

Changes

  • The output of the score_matrix() method of sequence.align.SubstitutionMatrix is not writable anymore, rendering a SubstitutionMatrix truly immutable
  • Renamed environment.yaml to environment.yml
  • A sequence.Feature must have at least one location

Fixes

  • Fixed incorrect centroid calculation in structure.superimpose(), when providing a boolean mask
  • Fixed installation of PyPI source distributions
  • Fixed issues when reading text files with \r\n line breaks (line breaks with carriage return, typical for Windows)
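
The \r\n issue mentioned above is easy to reproduce in plain Python. The snippet below shows the general pitfall and one common remedy; whether Biotite's fix uses this exact approach is not stated in these notes.

```python
windows_text = ">seq1\r\nACGT\r\n"

# Splitting on "\n" alone leaves a carriage return on every line
naive = windows_text.split("\n")

# splitlines() recognizes "\n", "\r\n" and "\r" as line boundaries
robust = windows_text.splitlines()
```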

Biotite 0.19.2

26 Dec 14:12
7419ba3

Changelog

Fixes

  • Fixed examples in gallery

Biotite 0.19.1

24 Dec 11:59
fb0f96a

Changelog

Additions

  • FASTA files can be downloaded from RCSB PDB via database.rcsb
  • Added structure.rotate_about_axis() and structure.align_vectors()
  • Added shape property and copy() method to structure.Atom
  • All array-like objects can be used to set an annotation array in an atom array (stack)
  • Added structure.info.residue() for getting the standard atoms and their coordinates for a given residue name
  • Added structure.graphics.plot_atoms() for interactive molecular visualization
  • Added exclusive_stop parameter to structure.get_residue_starts and structure.get_chain_starts
  • Added connect_via_residue_names() and connect_via_distances() for calculating a structure.BondList for a structure.AtomArray

Changes

  • structure.rotate() does not rotate the box of an atom array (stack) anymore
  • structure.BondList equality is not order dependent

Fixes

  • structure.BondList accepts all dtypes for integer arrays
  • structure.BondList accepts negative integers as indices
  • sequence.io.fasta.FastaFile: Tests for invalid or empty files
  • structure.io.pdb.PDBFile: Exception is raised if an invalid field in extra_fields is given
  • structure.rotate(): Fixed rotation direction

Biotite 0.18.0

20 Nov 15:12

Changelog

Additions

  • Added shape property to structure.AtomArray() and
    structure.AtomArrayStack()
  • structure.Atom() has default values for annotation arrays
  • The functions structure.rmsf(), structure.rmsd() and structure.average()
    accept coordinates directly
  • Added use_author_fields parameter to structure.io.pdbx.get_structure(),
    which selects between the usage of label_xxx and auth_xxx fields
  • Added chunk_size parameter to read() method of trajectory files to
    resolve memory issues
  • Added density() function for calculating atom densities.
  • Added sequence.align.get_pairwise_sequence_identity()
  • API reference shows source files of Cython modules

Changes

  • The module name (__module__ attribute) of functions/classes is
    changed to the name of the respective Biotite subpackage
    (e.g. biotite.structure.atoms to biotite.structure)
  • Changed handling of PDB insertion codes:
    • Atoms with insertion codes are not filtered out
    • Removed insertion_code parameter in
      biotite.structure.io.xxx.get_structure()
    • New mandatory annotation category ins_code
    • Changed structure.filter_inscode_and_altloc() to
      structure.filter_altloc()

Fixes

  • The step parameter in the read() method of trajectory files does not
    increase the stop frame
  • Negative residue IDs are handled correctly by structure file readers/writers
  • Fixed issues with indexing behavior in sequence.align.Alignment class
  • structure.remove_pbc() raises proper error message when box is missing
    in the given atom array (stack)
  • sequence.align.align_multiple() raises proper error message, if
    pairwise distance cannot be calculated due to great sequence dissimilarity
  • In sequence.io.genbank.get_annotation() qualifier keys without values
    (e.g. /pseudo) are handled properly
  • Added pyproject.toml specifying build dependencies for setup.py

Biotite 0.17.0

20 Sep 12:59
aa1b1f9

Changelog

Additions

  • Support for hybrid-36 encoding in structure.io.pdb.PDBFile
  • Added get_coord() method in structure.io.pdb.PDBFile for efficiently reading only the coordinates from a file
  • structure.CellList can be configured to put only a subset of atoms into the cells via the selection parameter
  • Improved functionalities in database subpackage.
    • A lot of new query types in database.rcsb
    • The min and max parameter of some database.rcsb queries are now optional
    • database.rcsb.fetch() and database.entrez.fetch() are able to write the downloaded files into a file-like object instead of writing the file to hard drive
    • database.entrez.fetch() properly checks for invalid responses from server based on https://github.com/kblin/ncbi-entrez-error-messages
    • database.entrez.fetch() also supports common database names
    • database.entrez.SimpleQuery also supports abbreviated field names
  • structure.io.load_structure() and structure.io.save_structure() support keyword arguments that are forwarded to the respective read() or get_structure() method.
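
Hybrid-36 extends fixed-width PDB number fields beyond their decimal limit: values that fit are written as decimals, followed by a range of base-36 strings starting with an uppercase letter, then a range starting with a lowercase letter. The following is a minimal sketch of that scheme with hypothetical function names; it is not the code used in structure.io.pdb.PDBFile.

```python
import string

DIGITS_UPPER = string.digits + string.ascii_uppercase
DIGITS_LOWER = string.digits + string.ascii_lowercase


def _to_base36(number, digits):
    """Convert a positive integer into a base-36 string."""
    chars = []
    while number:
        number, remainder = divmod(number, 36)
        chars.append(digits[remainder])
    return "".join(reversed(chars))


def hy36_encode(width, number):
    """Encode an integer into a hybrid-36 string of the given width."""
    decimal_limit = 10 ** width
    block = 26 * 36 ** (width - 1)   # size of each letter-led range
    offset = 10 * 36 ** (width - 1)  # smallest base-36 value starting with a letter
    if number <= -(10 ** (width - 1)):
        raise ValueError("number out of range for this width")
    if number < decimal_limit:
        return str(number).rjust(width)
    number -= decimal_limit
    if number < block:
        return _to_base36(number + offset, DIGITS_UPPER)
    number -= block
    if number < block:
        return _to_base36(number + offset, DIGITS_LOWER)
    raise ValueError("number out of range for this width")


def hy36_decode(width, text):
    """Decode a hybrid-36 string of the given width into an integer."""
    if len(text) != width:
        raise ValueError("text does not have the expected width")
    first = text[0]
    if first in "-+ " or first.isdigit():
        return int(text)
    if not (text.isascii() and text.isalnum()):
        raise ValueError("invalid hybrid-36 string")
    offset = 10 * 36 ** (width - 1)
    block = 26 * 36 ** (width - 1)
    if text == text.upper():
        return int(text, 36) - offset + 10 ** width
    if text == text.lower():
        return int(text, 36) - offset + block + 10 ** width
    raise ValueError("mixed-case hybrid-36 strings are invalid")
```

With a width of 5, as used for atom serial numbers, 99999 is still written as a decimal, while 100000 becomes A0000.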

Changes

  • database subpackage raises database.RequestError objects when the server gives an invalid response

Fixes

  • Fixed cross references in the API reference
  • sequence.io.genbank.GenBankFile raises a warning instead of an exception if the feature's location identifier is not understood and skips the feature
  • structure.io.pdb.PDBFile properly checks whether all models have the same number of atoms, when building a structure.AtomArrayStack

Biotite 0.16.0

16 Aug 13:31
bed7b6d

Changelog

Additions

  • New alignment color schemes
    • Color schemes for protein sequence alignments created with Gecos software
      • Including a color scheme adapted for red-green blindness
    • Color scheme for protein block sequence alignments created with Gecos software
    • Color schemes for protein sequence alignments adapted from JalView
  • More functionalities for external MSA software (application.MSAApp subclasses)
    • Additional CLI options can be set via add_additional_options()
    • The executed command of application.LocalApp can be obtained via get_command()
    • Most MSA software interfaces allow setting and getting the distance matrix and the guide tree
      • The corresponding methods are get_guide_tree(), set_guide_tree(), get_distance_matrix() and set_distance_matrix()
    • MSA software supporting custom substitution matrices can be used to align almost any type of sequence, even if the type is not directly supported by the underlying software
  • Added equality operator for sequence.align.Alignment objects
  • sequence.phylo.Tree supports non-binary trees
    • sequence.phylo.TreeNode can handle more than two child nodes
    • len() gives the number of leaves in sequence.phylo.Tree
    • sequence.phylo.Tree and sequence.phylo.TreeNode support hash and equality operator
    • sequence.phylo.as_binary() function converts non-binary tree into binary tree, as required for guide trees
  • Added sequence.phylo.neighbor_joining() for hierarchical clustering
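
Counting leaves and converting a non-binary tree into a binary one can be sketched on nested tuples, where a string is a leaf and a tuple is an inner node. This ignores branch distances entirely, and the way surplus children are folded into pairs is an assumption; how sequence.phylo.as_binary() places its intermediate nodes is not described in these notes.

```python
def count_leaves(node):
    """Return the number of leaves below (and including) this node."""
    if isinstance(node, str):
        return 1
    return sum(count_leaves(child) for child in node)


def as_binary(node):
    """Return an equivalent tree in which no node has more than two children."""
    if isinstance(node, str):
        return node
    children = [as_binary(child) for child in node]
    if not children:
        raise ValueError("inner nodes need at least one child")
    # Fold surplus children into nested pairs: (a, b, c) -> ((a, b), c)
    result = children[0]
    for child in children[1:]:
        result = (result, child)
    return result


tree = (("a", "b", "c"), "d")
binary = as_binary(tree)
```

The conversion keeps every leaf, so the leaf count is the same before and after.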

Changes

  • Removed X as symbol for ambiguous nucleotides, use N instead
  • Removed protected method get_default_bin_path() from application.MSAApp
  • Renamed protected method set_options() to set_arguments() in application.LocalApp
  • Renamed set_matrix() to set_substitution_matrix() in application.muscle.MuscleApp
  • Removed protected method get_cli_arguments() in application.LocalApp
  • Adapted constructor of sequence.phylo.TreeNode for a variable number of child nodes
  • application.MSAApp subclasses must implement abstract static methods describing which sequence types they support and whether they support custom substitution matrices

Fixes

  • U is automatically converted to T when loading nucleotide sequences from FASTA files
  • Score matrix in sequence.align.SubstitutionMatrix is now truly read-only via ndarray flag
  • application.Application subclasses (all external software interfaces) now properly check whether the corresponding objects are in the correct application.AppState
  • Error in evaluation step of application.Application now leaves application in application.AppState.CANCELLED state
  • Fixed InvalidFileError not being exposed to user
  • Symmetry checks in sequence.phylo.upgma() allow for small rounding errors
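
A symmetry check that tolerates small rounding errors, as mentioned for sequence.phylo.upgma(), can be sketched with math.isclose(). The function name and the tolerance values below are assumptions chosen for illustration, not the values used by the library.

```python
import math


def is_symmetric(matrix, rel_tol=1e-9, abs_tol=1e-12):
    """Check whether a square matrix equals its transpose within tolerance."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        return False
    return all(
        math.isclose(matrix[i][j], matrix[j][i], rel_tol=rel_tol, abs_tol=abs_tol)
        for i in range(n)
        for j in range(i + 1, n)
    )


# 0.1 + 0.2 differs from 0.3 only by floating-point rounding
distances = [[0.0, 0.1 + 0.2], [0.3, 0.0]]
exact = distances[0][1] == distances[1][0]
tolerant = is_symmetric(distances)
```

An exact comparison rejects this matrix, while the tolerant check accepts it.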