jdna

jdna.sequence

Represent linear or circularized nucleotides.

Classes

BindPos(template_bounds, query_bounds, …)

Makes a sequence binding position.

Feature(name[, type, strand, color])

An annotation for a sequence.

Nucleotide(base[, alphabet])

Represents a biological nucleotide.

Sequence([sequence, first, name, …])

Represents a biological sequence as a double linked list.

SequenceFlags

Constants/Flags for sequences.

class jdna.sequence.BindPos(template_bounds: Tuple[Nucleotide, Nucleotide], query_bounds: Tuple[Nucleotide, Nucleotide], template: jdna.sequence.Sequence, query: jdna.sequence.Sequence, direction: int, strand=<SequenceFlags.FORWARD: 1>)[source]

Bases: jdna.linked_list.LinkedListMatch

Makes a sequence binding position.

Parameters
  • template_bounds (tuple of Nucleotide) – 2-tuple containing the start and end nodes from the template

  • query_bounds (tuple of Nucleotide) – 2-tuple containing the start and end nodes from the query

  • template (DoubleLinkedList) – the template

  • query (DoubleLinkedList) – the query

  • direction (int) – If SequenceFlags.FORWARD, the binding position indicates binding forward, to the bottom strand of a dsDNA sequence.

  • strand (int) – If SequenceFlags.BOTTOM, then the query is assumed to be the reverse_complement of the original query

classmethod batch_create(template_bounds_list: List[Tuple[jdna.linked_list.Node, jdna.linked_list.Node]], query_bounds_list: List[Tuple[jdna.linked_list.Node, jdna.linked_list.Node]], template: jdna.linked_list.DoubleLinkedList, query: jdna.linked_list.DoubleLinkedList) → List[jdna.linked_list.LinkedListMatch]

Efficiently create several LinkedListMatches from lists of template starts/ends and query starts/ends.

Parameters
  • template_bounds_list (template DoubleLinkedList) – list of 2 len tuples containing starts and ends from a template

  • query_bounds_list (query DoubleLinkedList) – list of 2 len tuples containing starts and ends from a query

  • template (DoubleLinkedList) – the template

  • query (DoubleLinkedList) – the query

Returns

matches

Return type

list of LinkedListMatch

classmethod from_match(linked_list_match, template, query, direction, strand=<SequenceFlags.FORWARD: 1>)[source]

Return a binding position built from a linked list match.

Parameters

linked_list_match (LinkedListMatch) – the linked list match

Returns

the binding position

Return type

BindPos

class jdna.sequence.Feature(name, type=None, strand=None, color=None)[source]

Bases: object

An annotation for a sequence.

class jdna.sequence.Nucleotide(base, alphabet=<jdna.alphabet.Alphabet object>)[source]

Bases: jdna.linked_list.Node

Represents a biological nucleotide.

Serves as a Node in the Sequence object.

Nucleotide constructor.

Parameters

base (basestring) – base as a single character string

_break_connections()

Break connections in this node.

_complete_match(node: jdna.linked_list.Node, next_method: Callable) → bool

Return whether the longest match between two nodes is equivalent.

Parameters
  • node (Node) – the node to compare

  • next_method (callable) – how to obtain the next node

Returns

whether the longest match between two nodes is equivalent

Return type

bool

add_next(data)

Create a new node and add to next.

Parameters

data (any) – any data

Returns

the new node

Return type

Node

add_prev(data)

Create a new node and add to previous.

Parameters

data (any) – any data

Returns

the new node

Return type

Node

cut_next()[source]

Cut the next node, return the cut node.

Returns

the cut (next) node

Return type

Node

cut_prev()[source]

Cut the previous node, return the cut node.

Returns

the cut (previous) node

Return type

Node

equivalent(other) → bool[source]

Evaluates whether two nodes hold the same data.

find_first() → jdna.linked_list.Node

Find the head node.

Returns

the head node

Return type

Node

find_last() → jdna.linked_list.Node

Find the tail node.

Returns

the tail node

Return type

Node

fwd(stop_node: Optional[jdna.linked_list.Node] = None, stop_criteria: Callable = None) → Generator

Propagates forward until the stop node is visited or the stop criteria is reached.

Parameters
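  • stop_node (Node) – optional node at which to stop

  • stop_criteria (callable) – optional callable that halts the traversal when its criteria is reached

Returns

generator over the visited nodes

Return type

Generator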
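
The traversal described above can be pictured with a small stand-alone sketch. ToyNode is a hypothetical class, not jdna's Node, and two behaviours are assumptions made only for illustration: the stop node is yielded before halting, and a node that satisfies the stop criteria is not yielded.

```python
from typing import Callable, Iterator, Optional


class ToyNode:
    """Hypothetical singly linked node used only for this illustration."""

    def __init__(self, data: str) -> None:
        self.data = data
        self.next: Optional["ToyNode"] = None

    def fwd(
        self,
        stop_node: Optional["ToyNode"] = None,
        stop_criteria: Optional[Callable[["ToyNode"], bool]] = None,
    ) -> Iterator["ToyNode"]:
        """Yield nodes from self onward by following .next links."""
        visited = set()  # guards against looping forever on a cyclic list
        node: Optional["ToyNode"] = self
        while node is not None and id(node) not in visited:
            # Assumption: halt *before* yielding a node that meets the criteria.
            if stop_criteria is not None and stop_criteria(node):
                return
            yield node
            # Assumption: the stop node itself is yielded, then traversal ends.
            if node is stop_node:
                return
            visited.add(id(node))
            node = node.next


nodes = [ToyNode(base) for base in "ATGCA"]
for left, right in zip(nodes, nodes[1:]):
    left.next = right

everything = [n.data for n in nodes[0].fwd()]
up_to_g = [n.data for n in nodes[0].fwd(stop_node=nodes[2])]
before_c = [n.data for n in nodes[0].fwd(stop_criteria=lambda n: n.data == "C")]
```

A backward traversal would follow the same pattern over the previous-node links.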

longest_match(node: jdna.linked_list.Node, next_method: Callable = None) → Tuple[jdna.linked_list.Node, jdna.linked_list.Node]

Find the longest match between two linked_lists.

Parameters
  • node (Node) – the node to compare

  • next_method (callable) – how to obtain the next node

Returns

list of tuples containing matching nodes

Return type

list

next()

Return the next node.

Returns

the next node

Return type

Node

prev()

Return the previous node.

Returns

the previous node

Return type

Node

classmethod random()[source]

Generate a random nucleotide.

remove()

Remove node from linked list, connecting the previous and next nodes together.

Returns

None

Return type

None
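
As a rough picture of what removing a node from a doubly linked list involves, the sketch below bridges the two neighbours and detaches the removed node. ToyNode and link are hypothetical helpers for this illustration and do not reproduce jdna's implementation.

```python
from typing import List, Optional


class ToyNode:
    """Hypothetical node; only the linking logic is of interest here."""

    def __init__(self, data: str) -> None:
        self.data = data
        self.prev: Optional["ToyNode"] = None
        self.next: Optional["ToyNode"] = None

    def remove(self) -> None:
        # Bridge the neighbours over this node, then detach it.
        if self.prev is not None:
            self.prev.next = self.next
        if self.next is not None:
            self.next.prev = self.prev
        self.prev = None
        self.next = None


def link(*data: str) -> List[ToyNode]:
    """Build a chain of nodes and return them in order."""
    nodes = [ToyNode(d) for d in data]
    for left, right in zip(nodes, nodes[1:]):
        left.next = right
        right.prev = left
    return nodes


a, t, g = link("A", "T", "G")
t.remove()  # a and g are now directly connected
```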

rev(stop_node: Optional[jdna.linked_list.Node] = None, stop_criteria: Callable = None) → Generator[jdna.linked_list.Node, None, None]

Propagates backward until the stop node is visited or the stop criteria is reached.

Parameters
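  • stop_node (Node) – optional node at which to stop

  • stop_criteria (callable) – optional callable that halts the traversal when its criteria is reached

Returns

generator over the visited nodes

Return type

Generator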

set_next(nucleotide)[source]

Set the next node.

Parameters

nucleotide (Nucleotide) – the nucleotide to set as the next node

set_prev(nucleotide)[source]

Set the previous node.

Parameters

nucleotide (Nucleotide) – the nucleotide to set as the previous node

swap()

Swap the previous and next nodes.

Returns

None

Return type

None

class jdna.sequence.Sequence(sequence: Sequence[Any] = None, first: jdna.sequence.Nucleotide = None, name: str = None, description: str = '', metadata: dict = None, cyclic: bool = False, alphabet=<jdna.alphabet.Alphabet object>)[source]

Bases: jdna.linked_list.DoubleLinkedList

Represents a biological sequence as a double linked list.

Can be annotated with features.

Parameters
  • sequence (basestring) – sequence string

  • first (Nucleotide) – optional first Nucleotide to use as the ‘head’ to this Sequence

  • name (basestring) – optional name of the sequence

  • description (basestring) – optional description of the sequence

  • metadata (dict) – additional sequence metadata

  • cyclic (bool) – whether to make the sequence circular

  • alphabet (jdna.alphabet.Alphabet) – the base pair alphabet of this sequence, which is used for complementation and comparisons (default: AmbiguousDNA)
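
To make the doubly linked representation more concrete, here is a minimal stand-alone sketch. ToyNode and ToySequence are hypothetical classes written for this illustration; they are not jdna's Nucleotide or Sequence and only show how a head node, the prev/next links, and a cyclic flag might fit together.

```python
from typing import Iterator, Optional


class ToyNode:
    """Hypothetical stand-in for a nucleotide node."""

    def __init__(self, base: str) -> None:
        self.base = base
        self.prev: Optional["ToyNode"] = None
        self.next: Optional["ToyNode"] = None


class ToySequence:
    """Hypothetical doubly linked sequence with an optional cyclic closure."""

    def __init__(self, sequence: str, cyclic: bool = False) -> None:
        self.first: Optional[ToyNode] = None
        self.cyclic = cyclic
        last: Optional[ToyNode] = None
        for base in sequence:
            node = ToyNode(base)
            if last is None:
                self.first = node
            else:
                last.next = node
                node.prev = last
            last = node
        # Closing the loop links the tail back to the head.
        if cyclic and last is not None and self.first is not None:
            last.next = self.first
            self.first.prev = last

    def __iter__(self) -> Iterator[ToyNode]:
        node = self.first
        while node is not None:
            yield node
            node = node.next
            if node is self.first:  # stop after one lap around a cyclic list
                break

    def __str__(self) -> str:
        return "".join(node.base for node in self)


linear = ToySequence("ATGC")
plasmid = ToySequence("ATGC", cyclic=True)
```

In the linear case the head has no previous node; in the cyclic case the head's previous node is the tail.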

class DEFAULTS[source]

Bases: object

Sequence defaults.

class Direction

Bases: enum.IntFlag

An enumeration.

NODE_CLASS

alias of Nucleotide

add_feature(start, end, feature)[source]

Add a feature spanning the start and end positions (inclusive).

Parameters
  • start (int) – start

  • end (int) – end (inclusive)

  • feature (Feature) – the feature to add

Returns

the added feature

Return type

Feature

add_multipart_feature(positions, feature)[source]

Add a multi-part feature (i.e., a disjoint feature).

Parameters
  • positions (list) – list of start and end positions as tuples, e.g. [(1, 100), (110, 200)]

  • feature (Feature) – the feature to add

Returns

the added feature

Return type

Feature
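
Because feature end positions are inclusive, a (start, end) pair covers one more index than a Python slice with the same numbers. The helper below is a hypothetical illustration of that convention for single and multi-part positions; it does not touch jdna objects.

```python
from typing import List, Tuple


def covered_indices(positions: List[Tuple[int, int]]) -> List[int]:
    """Expand inclusive (start, end) pairs into the indices they cover."""
    indices: List[int] = []
    for start, end in positions:
        # end is inclusive, so range() needs end + 1
        indices.extend(range(start, end + 1))
    return indices


single = covered_indices([(2, 5)])
multipart = covered_indices([(1, 3), (6, 7)])
```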

anneal(ssDNA, min_bases=13, depth=None)[source]

Simulate annealing a single-stranded piece of DNA to a double-stranded template.

anneal_forward(other, min_bases=13, depth=None)[source]

Anneal a sequence in the forward direction.

anneal_reverse(other, min_bases=13, depth=None)[source]

Anneal a sequence in the reverse direction.
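
The idea of forward and reverse annealing with a minimum number of bases can be sketched on plain strings. Everything below is an assumption made for illustration: the function names are hypothetical, binding is taken to require an exact match of the primer's last min_bases bases (its 3' end), forward binding is searched on the template as written, and reverse binding is searched on the template's reverse complement. jdna's actual matching rules may differ.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def three_prime_sites(template: str, primer: str, min_bases: int = 13) -> List[int]:
    """Return start indices where the primer's 3' end matches the template."""
    if len(primer) < min_bases:
        return []
    seed = primer[-min_bases:]  # the last min_bases bases of the primer
    sites = []
    start = template.find(seed)
    while start != -1:
        sites.append(start)
        start = template.find(seed, start + 1)
    return sites


def anneal_sketch(
    template: str, primer: str, min_bases: int = 13
) -> Tuple[List[int], List[int]]:
    """Return (forward sites, reverse sites) for the primer on the template."""
    forward = three_prime_sites(template, primer, min_bases)
    reverse = three_prime_sites(reverse_complement(template), primer, min_bases)
    return forward, reverse


template = "TTGACCATGCAAGGCT"
# A forward primer with a 5' overhang ("GGG") that does not match the template.
fwd_hits = anneal_sketch(template, "GGGGACCATG", min_bases=5)
# A primer that only matches the opposite strand.
rev_hits = anneal_sketch(template, "AGCCTTG", min_bases=5)
```

Reverse sites in this sketch are indices into the reverse-complemented template, not into the original strand.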

annotate(start, end, name, feature_type=None, color=None, strand=None)[source]

Annotate a region.

Parameters
  • start (int) – start

  • end (int) – end (inclusive)

  • name (basestring) – feature name

  • feature_type (basestring) – feature type (default=misc)

  • color (basestring) – optional feature color

Returns

new feature

Return type

Feature

c()[source]

Complement the sequence in place.

property circular

Alias for cyclic.

static collect_nodes(nodes)

Collect all visited nodes and return them as an unordered set.

Returns

all visited nodes

Return type

set

compare(other: jdna.linked_list.DoubleLinkedList) → bool

Compare two linked lists. If both are cyclic, the comparison attempts to reindex them before comparing.

Parameters

other (DoubleLinkedList) – other linked list

Returns

whether sequence data is equivalent

Return type

bool
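
One way to picture comparing two cyclic sequences regardless of where each one starts is the classic rotation check on strings, shown below. This is a stand-in technique for illustration, with a hypothetical function name; it is not how jdna reindexes linked lists.

```python
def cyclic_equivalent(a: str, b: str) -> bool:
    """Return True if circular sequence b is a rotation of circular sequence a."""
    # Every rotation of a appears as a substring of a + a.
    return len(a) == len(b) and b in a + a


same_plasmid = cyclic_equivalent("ATGC", "GCAT")
different = cyclic_equivalent("ATGC", "GCTA")
```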

complement()[source]

Complement the sequence in place.

copy_slice(start: jdna.linked_list.Node, end: jdna.linked_list.Node) → jdna.linked_list.DoubleLinkedList

Return a copy of the sequence between ‘start’ and ‘end’ nodes.

If start is None, return the slice copy from head to end. If end is None, return copy from start to tail. If both start and end are None return None.

digest(enzymes, as_names=False)[source]

Digest the sequence with the supplied enzymes. Each enzyme is either a Bio.RestrictionSite or a tuple of (seq, cut1, cut2), e.g. (‘GTTTAAAC’, 4, -4).

Parameters

enzymes (list (of tuple|Bio.RestrictionSite)) – either a Bio.RestrictionSite or a tuple of (seq, cut1, cut2)

Returns

list of sequences

Return type

list

dsanneal(dsDNA, min_bases=13, depth=None)[source]

Simulate annealing a double-stranded piece of DNA to a double-stranded template.

property features

Return a list of feature positions.

Parameters

with_nodes (bool) – if True, will return a tuple composed of a feature to position dictionary and a feature to start and end node. If False, will just return a feature to position dictionary

Returns

feature positions dictionary OR tuple of feature positions dictionary and feature node dictionary

Return type

tuple

property features_list

Returns set of features contained in sequence.

Returns

set of features in this sequence

Return type

set

static find_ends(nodes)

Efficiently finds the head and tails from a group of nodes.

find_feature_by_name(name)[source]

Find features by name.

Parameters

name (basestring) – feature name

Returns

list of features

Return type

list

find_iter(query: jdna.linked_list.DoubleLinkedList, min_query_length: int = None, direction: int = <Direction.FORWARD: 1>, protocol: Callable = None, depth: int = None)

Iteratively finds positions that match the query.

Parameters
  • query (DoubleLinkedList) – query list to find

  • min_query_length (int) – the minimum number of matches to return. If None, find_iter will only return complete matches

  • direction (int or tuple) – If Direction.FORWARD (+1), find iter will search from the query head and search forward, potentially leaving a ‘tail’ overhang on the query. If Direction.REVERSE (-1), find iter will search from the query tail and search reverse, potentially leaving a ‘head’ overhang on the query. If a tuple, the template_direction and query_direction are set respectively.

  • protocol (callable) – the callable taking two parameters (as Node) to compare during find. If None, defaults to ‘equivalent’

Returns

list of LinkedListMatches

Return type

list
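
The interplay of direction and min_query_length can be illustrated on plain strings. The sketch below is a hypothetical simplification: it uses string indices instead of nodes, exact character equality instead of a protocol callable, and a boolean forward flag instead of the Direction enum.

```python
from typing import Iterator, Optional, Tuple


def _common_prefix(a: str, b: str) -> int:
    """Count how many leading characters a and b share."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def find_iter_sketch(
    template: str,
    query: str,
    min_query_length: Optional[int] = None,
    forward: bool = True,
) -> Iterator[Tuple[int, int]]:
    """Yield (template_start, matched_length) for each qualifying match."""
    if not query:
        return
    # With no minimum, only complete matches of the query qualify.
    required = len(query) if min_query_length is None else min_query_length
    if forward:
        # Match from the query head; unmatched query tail is the overhang.
        for i in range(len(template)):
            n = _common_prefix(template[i:], query)
            if n >= required:
                yield i, n
    else:
        # Match from the query tail; unmatched query head is the overhang.
        for j in range(1, len(template) + 1):
            n = _common_prefix(template[:j][::-1], query[::-1])
            if n >= required:
                yield j - n, n


template = "AAGGCTTAAGGC"
tail_overhang = list(find_iter_sketch(template, "GGCTTCC", min_query_length=3))
head_overhang = list(
    find_iter_sketch(template, "CCAAGGC", min_query_length=4, forward=False)
)
complete_only = list(find_iter_sketch(template, "GGCTT"))
```

In the forward case the trailing "CC" of the query is left unmatched; in the reverse case the leading "CC" is.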

inclusive_range(i: int, j: int) → Generator[jdna.linked_list.Node, None, None]

Return generator for inclusive nodes between index i and j.

If i is None, assume i is the head node.

json()[source]

Print sequence to a json dictionary.

classmethod load(data)[source]

Load a sequence from a json formatted dictionary.

classmethod new_slice(start: jdna.linked_list.Node, end: jdna.linked_list.Node) → jdna.linked_list.DoubleLinkedList

Return a copy of the sequence between ‘start’ and ‘end’ nodes.

node_range(start: jdna.linked_list.Node, end: jdna.linked_list.Node) → Generator[jdna.linked_list.Node, None, None]

Iterate from the ‘start’ node to the ‘end’ node (inclusive).

print(indent=None, width=None, spacer=None, complement=False, features=True, **kwargs)[source]

Create and print a SequenceViewer instance from this sequence. Printing the view object with annotations and complement will produce an output similar to the following:

> "Unnamed" (550bp)


                                                            ----------------GFP----------------
                                                            |<START
                                                            ----      -----------RFP-----------
0         CCCAGGACTAGCGACTTTCCGTAACGCGACCTAACACCGGCCGTTCCTTCGAGCCAGGCAAATGTTACGTCACTTCCTTAGATTT
          GGGTCCTGATCGCTGAAAGGCATTGCGCTGGATTGTGGCCGGCAAGGAAGCTCGGTCCGTTTACAATGCAGTGAAGGAATCTAAA

          ------GFP------
          -----------------------------------------RFP-----------------------------------------
85        TGAACAGCGCCGTACCCCGATATGATATTTAGATATATAGCAGTTACACTTGGGGTTGCTATGGACTTAGATCTGCTGTATGTTT
          ACTTGTCGCGGCATGGGGCTATACTATAAATCTATATATCGTCAATGTGAACCCCAACGATACCTGAATCTAGACGACATACAAA

          -----------------------------------------RFP-----------------------------------------
170       TCTTACCTTCCGCATCAGGGGACAATTCGCCAGTAGAATTCAGTTTGTGCGTGAGAACATAAGATTGAATCCCACGCAGGCACAA
          AGAATGGAAGGCGTAGTCCCCTGTTAAGCGGTCATCTTAAGTCAAACACGCACTCTTGTATTCTAACTTAGGGTGCGTCCGTGTT

          ---------------------RFP----------------------
255       GCAGGGCGGGCAGACTCTATAGGTCCTAAGACCCTGAGACTGCGTCCTCAAGATACAGGTTAACAATCCCCGTATGGAGCCGTTC
          CGTCCCGCCCGTCTGAGATATCCAGGATTCTGGGACTCTGACGCAGGAGTTCTATGTCCAATTGTTAGGGGCATACCTCGGCAAG

340       TTAGCATGACCCGACAGGTGGGCTTGGCTCGCGTAAGTTGAGTGTTGCAGATACCTGCTGCTGCGCGGTCTAGGGGGAATCGCCG
          AATCGTACTGGGCTGTCCACCCGAACCGAGCGCATTCAACTCACAACGTCTATGGACGACGACGCGCCAGATCCCCCTTAGCGGC

425       ATTTTGACGTAGGATCGGTAATGGGCAGTAAACCCGCAACTATTTTCAGCACCAGATGCAAGTTTCCCTAGAAAGCGTCATGGTT
          TAAAACTGCATCCTAGCCATTACCCGTCATTTGGGCGTTGATAAAAGTCGTGGTCTACGTTCAAAGGGATCTTTCGCAGTACCAA

510       TGCAATCTCCTTAGGTCACAGCAAACATAGCAGCCCCTGT
          ACGTTAGAGGAATCCAGTGTCGTTTGTATCGTCGGGGACA
Parameters
  • indent (int) – indent between left column and base pairs view window

  • width (int) – width of the view window

  • spacer (basestring) – string to intersperse between sequence rows (default is newline)

  • complement (bool) – whether to include the complementary strand in the view

  • features (bool) – whether to include annotations/features in the view instance

Returns

the viewer object

Return type

SequenceViewer

classmethod random(length)[source]

Generate a random sequence.

range(i: int, j: int) → Generator[jdna.linked_list.Node, None, None]

Returns an iterator from node at ‘i’ to node at ‘j-1’.
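
The difference between range (which stops at ‘j-1’) and inclusive_range (which includes ‘j’, and treats i=None as the head) is easy to get wrong by one. The two functions below are hypothetical stand-ins that operate on a plain list rather than on linked nodes, purely to contrast the two index conventions.

```python
from typing import Iterator, List, Optional


def half_open(nodes: List[str], i: int, j: int) -> Iterator[str]:
    """Yield nodes[i] through nodes[j - 1], like Python's built-in range."""
    for index in range(i, j):
        yield nodes[index]


def inclusive(nodes: List[str], i: Optional[int], j: int) -> Iterator[str]:
    """Yield nodes[i] through nodes[j]; i=None starts from the head."""
    start = 0 if i is None else i
    for index in range(start, j + 1):
        yield nodes[index]


bases = list("ATGCAT")
exclusive_result = list(half_open(bases, 1, 4))
inclusive_result = list(inclusive(bases, 1, 4))
from_head = list(inclusive(bases, None, 2))
```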

rc()[source]

Reverse complement the sequence in place.

reverse_complement()[source]

Reverse complement the sequence in place.
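
For readers unfamiliar with the operation, the sketch below shows complementing and reverse complementing on plain strings. Unlike the methods documented here, which modify the sequence in place, these hypothetical helpers return new strings, and they handle only the four unambiguous bases.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Swap each base for its Watson-Crick partner, keeping the order."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complement the sequence and read it in the opposite direction."""
    return complement(seq)[::-1]


top = "ATGC"
bottom = complement(top)
rc = reverse_complement(top)
ecori = reverse_complement("GAATTC")  # a palindromic site maps onto itself
```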

tm()[source]

Calculate the Tm of this sequence using primer3 defaults.

Returns

the tm of the sequence

Return type

float
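
primer3 estimates melting temperature with a nearest-neighbor thermodynamic model, which is too involved to reproduce here. To give a feel for the quantity being computed, the sketch below swaps in the much cruder Wallace rule (2 °C per A or T, 4 °C per G or C), which is only a rough guide for short oligos and will not match the value returned by tm(). The function name is hypothetical.

```python
def wallace_tm(seq: str) -> float:
    """Estimate Tm in degrees Celsius with the Wallace rule (not primer3)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


estimate = wallace_tm("ATGCATGCATGCAT")  # 8 A/T bases and 6 G/C bases
```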

view(indent=10, width=85, spacer=None, complement=False, features=True, **kwargs)[source]

Create a SequenceViewer instance from this sequence. Printing the view object with annotations and complement will produce an output similar to the following:

> "Unnamed" (550bp)


                                                            ----------------GFP----------------
                                                            |<START
                                                            ----      -----------RFP-----------
0         CCCAGGACTAGCGACTTTCCGTAACGCGACCTAACACCGGCCGTTCCTTCGAGCCAGGCAAATGTTACGTCACTTCCTTAGATTT
          GGGTCCTGATCGCTGAAAGGCATTGCGCTGGATTGTGGCCGGCAAGGAAGCTCGGTCCGTTTACAATGCAGTGAAGGAATCTAAA

          ------GFP------
          -----------------------------------------RFP-----------------------------------------
85        TGAACAGCGCCGTACCCCGATATGATATTTAGATATATAGCAGTTACACTTGGGGTTGCTATGGACTTAGATCTGCTGTATGTTT
          ACTTGTCGCGGCATGGGGCTATACTATAAATCTATATATCGTCAATGTGAACCCCAACGATACCTGAATCTAGACGACATACAAA

          -----------------------------------------RFP-----------------------------------------
170       TCTTACCTTCCGCATCAGGGGACAATTCGCCAGTAGAATTCAGTTTGTGCGTGAGAACATAAGATTGAATCCCACGCAGGCACAA
          AGAATGGAAGGCGTAGTCCCCTGTTAAGCGGTCATCTTAAGTCAAACACGCACTCTTGTATTCTAACTTAGGGTGCGTCCGTGTT

          ---------------------RFP----------------------
255       GCAGGGCGGGCAGACTCTATAGGTCCTAAGACCCTGAGACTGCGTCCTCAAGATACAGGTTAACAATCCCCGTATGGAGCCGTTC
          CGTCCCGCCCGTCTGAGATATCCAGGATTCTGGGACTCTGACGCAGGAGTTCTATGTCCAATTGTTAGGGGCATACCTCGGCAAG

340       TTAGCATGACCCGACAGGTGGGCTTGGCTCGCGTAAGTTGAGTGTTGCAGATACCTGCTGCTGCGCGGTCTAGGGGGAATCGCCG
          AATCGTACTGGGCTGTCCACCCGAACCGAGCGCATTCAACTCACAACGTCTATGGACGACGACGCGCCAGATCCCCCTTAGCGGC

425       ATTTTGACGTAGGATCGGTAATGGGCAGTAAACCCGCAACTATTTTCAGCACCAGATGCAAGTTTCCCTAGAAAGCGTCATGGTT
          TAAAACTGCATCCTAGCCATTACCCGTCATTTGGGCGTTGATAAAAGTCGTGGTCTACGTTCAAAGGGATCTTTCGCAGTACCAA

510       TGCAATCTCCTTAGGTCACAGCAAACATAGCAGCCCCTGT
          ACGTTAGAGGAATCCAGTGTCGTTTGTATCGTCGGGGACA
Parameters
  • indent (int) – indent between left column and base pairs view window

  • width (int) – width of the view window

  • spacer (basestring) – string to intersperse between sequence rows (default is newline)

  • complement (bool) – whether to include the complementary strand in the view

  • features (bool) – whether to include annotations/features in the view instance

Returns

the viewer object

Return type

SequenceViewer
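
The role of indent, width, and complement in the layout above can be sketched with a few lines of string formatting. view_sketch is a hypothetical function, not SequenceViewer: it omits the header line, features, and spacer handling, and it assumes the row index is left-justified within the indent column, as the sample output suggests.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def view_sketch(
    seq: str, indent: int = 10, width: int = 85, complement: bool = False
) -> str:
    """Wrap a sequence into indexed rows, optionally with the bottom strand."""
    rows = []
    for start in range(0, len(seq), width):
        chunk = seq[start:start + width]
        # Left column holds the index of the first base in the row.
        rows.append(str(start).ljust(indent) + chunk)
        if complement:
            rows.append(" " * indent + chunk.translate(_COMPLEMENT))
    return "\n".join(rows)


small_view = view_sketch("ATGCATGC", indent=4, width=4, complement=True)
print(small_view)
```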

class jdna.sequence.SequenceFlags[source]

Bases: enum.IntFlag

Constants/Flags for sequences.