DNA-sequences

A dynamic library that stores a DNA-sequence as a string, a dynamic array or a pointer list and performs operations on it, such as concatenate and slice. The three storage types share a common interface built on an abstract base class. A test program is included for trying the library on text files with DNA-sequences and variants. The dynamic array and pointer list store integers, and the user can choose the size (in bits) of these integers.
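To illustrate the abstract-base-class idea, here is a minimal C++ sketch. It is not the library's actual code: the names `Sequence`, `StringSequence`, `PackedSequence`, `append`, `makeEmpty` and the other members are hypothetical. Packing each base into 2 bits of a user-chosen integer type is only an assumption about how the integer size could be used. Shared operations such as concatenate and slice are written once against the interface, so every storage type gets them for free.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical interface; the names are illustrative, not taken from the library.
class Sequence {
public:
    virtual ~Sequence() = default;
    virtual std::size_t length() const = 0;
    virtual char at(std::size_t i) const = 0;
    virtual void append(char base) = 0;
    virtual std::unique_ptr<Sequence> makeEmpty() const = 0;

    // Shared operations, implemented once in terms of the virtual functions.
    std::string toString() const {
        std::string out;
        for (std::size_t i = 0; i < length(); ++i) out += at(i);
        return out;
    }
    void concatenate(const Sequence& other) {
        const std::size_t n = other.length();  // fixed up front, so self-concatenation ends
        for (std::size_t i = 0; i < n; ++i) append(other.at(i));
    }
    // Copies positions [start, end), counted from 0, into a new sequence.
    std::unique_ptr<Sequence> slice(std::size_t start, std::size_t end) const {
        auto out = makeEmpty();
        for (std::size_t i = start; i < end && i < length(); ++i) out->append(at(i));
        return out;
    }
};

// Storage as a plain string.
class StringSequence : public Sequence {
    std::string data_;
public:
    std::size_t length() const override { return data_.size(); }
    char at(std::size_t i) const override { return data_.at(i); }
    void append(char base) override { data_ += base; }
    std::unique_ptr<Sequence> makeEmpty() const override {
        return std::make_unique<StringSequence>();
    }
};

// Storage as a dynamic array of integers. Assumption: 2 bits per base,
// with the integer type (and so its size in bits) chosen by the user.
template <typename Word>
class PackedSequence : public Sequence {
    static constexpr std::size_t kPerWord = sizeof(Word) * 8 / 2;  // bases per integer
    std::vector<Word> words_;
    std::size_t size_ = 0;
public:
    std::size_t length() const override { return size_; }
    char at(std::size_t i) const override {
        const Word code = (words_.at(i / kPerWord) >> (2 * (i % kPerWord))) & Word{3};
        return "ACGT"[code];
    }
    void append(char base) override {
        const std::size_t code = std::string("ACGT").find(base);
        if (code == std::string::npos) throw std::invalid_argument("not a DNA base");
        if (size_ % kPerWord == 0) words_.push_back(0);  // start a fresh integer
        words_.back() |= static_cast<Word>(code << (2 * (size_ % kPerWord)));
        ++size_;
    }
    std::unique_ptr<Sequence> makeEmpty() const override {
        return std::make_unique<PackedSequence<Word>>();
    }
};
```

With this layout, code that only knows about `Sequence` works with either storage type. For example, a `StringSequence` can be concatenated with a `PackedSequence<std::uint8_t>` without any special handling.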

This was a group project for school, hosted on the university's git server, so I copied the whole project to this repository. The program output, comments and (most) variable names are in Dutch, since the course was taught in Dutch. If anything is unclear, feel free to contact me.

Variants

Each method we can perform on a DNA-sequence returns a product, which we call a variant. These methods are typical for working with DNA-sequences. They are listed below, together with the keyword that should be used in the input to apply them.

  • Deletion ("del")
  • Inversion ("inv")
  • Insertion ("ins")
  • Substitution (">")
  • Deletion-Insertion ("delins")

A variant has the form "[start] [end] [type] ([sequence])". Some examples can be found in the testvariants directory. A sequence should be written in capital letters and contain only A, T, C and G. Positions in a sequence are counted starting from 0, not 1.
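As a rough illustration of the variant format, the sketch below parses one variant line and applies it to a DNA-sequence held in a `std::string`. It is not the library's implementation: `applyVariant` is a hypothetical helper, and several details are assumptions because this README does not specify them. The assumptions are that the end position is exclusive (the variant covers [start, end)), that an insertion places the sequence at start and ignores end, and that an inversion only reverses the order of the bases without complementing them.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of the library.
// Assumptions: positions are 0-based and the end position is exclusive.
std::string applyVariant(const std::string& dna, const std::string& line) {
    std::istringstream in(line);
    std::size_t start = 0, end = 0;
    std::string type, seq;
    if (!(in >> start >> end >> type))
        throw std::invalid_argument("malformed variant: " + line);
    in >> seq;  // the sequence is absent for "del" and "inv"

    if (start > end || end > dna.size())
        throw std::out_of_range("positions outside the sequence: " + line);
    if (seq.find_first_not_of("ATCG") != std::string::npos)
        throw std::invalid_argument("sequence may only contain A, T, C and G");

    std::string result = dna;
    const std::size_t len = end - start;
    if (type == "del") {
        result.erase(start, len);
    } else if (type == "inv") {
        // Assumption: plain reversal, no complementing of the bases.
        std::reverse(result.begin() + start, result.begin() + end);
    } else if (type == "ins") {
        // Assumption: insert before position start; end is ignored.
        result.insert(start, seq);
    } else if (type == ">" || type == "delins") {
        // Both replace the covered range with the given sequence.
        result.replace(start, len, seq);
    } else {
        throw std::invalid_argument("unknown variant type: " + type);
    }
    return result;
}
```

Under these assumptions, `applyVariant("ATCGATCG", "2 4 del")` yields `ATATCG` and `applyVariant("ATCGATCG", "3 4 > A")` yields `ATCAATCG`. The files in the testvariants directory remain the reference for how the library itself interprets positions.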

Test program

The test program starts with a menu that asks which type should be used to store the DNA-sequences (string, dynamic array, or pointer list). It then asks for a text file containing the first DNA-sequence to work with. After that, a repeating menu offers a number of actions. The most interesting one applies a text file with variants to the currently stored DNA-sequence. Some text files with sequences and variants are included to make testing easier.

My contribution

This project was made with a classmate, so I did not write all the code in it. My contribution is the following:

  • seqarray.h
  • deletie.h, deletie.cc
  • delinsertie.h, delinsertie.cc
  • insertie.h, insertie.cc
  • inversie.h, inversie.cc
  • substitutie.h, substitutie.cc
  • variant.h, variant.cc
  • apply function in sequentie.h, sequentie.cc
  • testprogramma.cc

The following I added myself to make it easy for others to run and test the library:

  • testsequences (directory with 3 files of DNA-sequences)
  • testvariants (directory with 3 files of variants)
  • makelibrary.sh
  • maketestprogram.sh
  • cleanupall.sh (little extra script for cleaning up)