Thursday, 26 May 2022

Transitions and Transversions (Rosalind | English)

       

[Image: Rosalind]


Hi everyone, how are you? This time, I want to discuss a bioinformatics problem from the rosalind.info website. Its title is "Transitions and Transversions". For reference, you can first check out the problem that will be discussed (here).

Overview 

In this problem we are given two DNA strings of equal length in FASTA format that represent "the string before changing" and "the string after changing"; let's call them s1 and s2. Our task is to find the ratio of transitions to transversions between the two strings.

What are transitions and transversions?

A transition is a change from one purine base into the other purine base (A -> G, G -> A), or from one pyrimidine base into the other pyrimidine base (C -> T, T -> C). A transversion is a change from a purine into a pyrimidine or vice versa (for example, A -> C).
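The rule above can be sketched as a small classifier. This is only an illustration, not the code used later in this article: the class name SubstitutionKind and the methods isPurine and classify are hypothetical, and the input is assumed to contain only the bases A, C, G and T.

```java
/** Minimal sketch: classify a single-base substitution (all names are hypothetical). */
class SubstitutionKind {

    /** Purines are A and G; the other two DNA bases, C and T, are pyrimidines. */
    static boolean isPurine(char base) {
        return base == 'A' || base == 'G';
    }

    /** Returns "none", "transition" or "transversion"; assumes both bases are A, C, G or T. */
    static String classify(char before, char after) {
        if (before == after) {
            return "none";
        }
        // Both purines or both pyrimidines: transition. Mixed classes: transversion.
        return isPurine(before) == isPurine(after) ? "transition" : "transversion";
    }

    public static void main(String[] args) {
        System.out.println(classify('A', 'G')); // transition
        System.out.println(classify('C', 'T')); // transition
        System.out.println(classify('A', 'C')); // transversion
    }
}
```

Comparing the chemical class of both bases avoids listing every pair by hand, which is one possible alternative to a lookup table.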

The idea for solving this problem is, first, to find the positions where the two strings differ. Then, for each of those differences, we determine whether it is a transition or a transversion.


Transition and transversion.


The Code

This is the code for solving this problem in Java. The method relies on java.util.Scanner and java.util.HashMap, and on an out writer that is defined elsewhere in my complete code:
  1. static void solve() {
  2.     Scanner sc = new Scanner(System.in);
  3.     String s1 = "";
  4.     String s2 = "";
  5.     boolean ganti = false; // "ganti" = switch: true once the second record starts
  6.     sc.next(); // skip the header of the first FASTA record
  7.     while (sc.hasNext()) {
  8.         String masuk = sc.next(); // "masuk" = the input token just read
  9.         if (masuk.charAt(0) == '>') {
  10.             ganti = true; continue;
  11.         }
  12.         if (ganti) {
  13.             s2 += masuk;
  14.         }
  15.         else {
  16.             s1 += masuk;
  17.         }
  18.     }
  19.     int ts = 0; // number of transitions
  20.     int tv = 0; // number of transversions
  21.     HashMap<Character, Character> pair = new HashMap<>();
  22.     pair.put('A', 'G');
  23.     pair.put('G', 'A');
  24.     pair.put('C', 'T');
  25.     pair.put('T', 'C');
  26.     for (int i = 0; i < s1.length(); i++) {
  27.         if (s1.charAt(i) == s2.charAt(i)) {
  28.             continue;
  29.         }
  30.         if (pair.get(s1.charAt(i)) == s2.charAt(i)) {
  31.             ts++;
  32.         }
  33.         else {
  34.             tv++;
  35.         }
  36.     }
  37.     out.printf("%.11f", 1.0 * ts / tv);
  38. }
The Code Description

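After reading the two strings (lines 2 to 18), we build a map that pairs each base with its transition partner, following the purine/pyrimidine rule ('A' with 'G', 'C' with 'T'), as in lines 21 to 25.

The next step is to count the transitions and transversions (lines 26 to 36): positions where the two strings match are skipped, a mismatch whose new base is the mapped partner counts as a transition, and any other mismatch counts as a transversion. The answer to this problem is the number of transitions divided by the number of transversions, printed in decimal form (line 37).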
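To check the counting on a tiny hand-made input, the same idea can be wrapped in a helper that takes the two strings directly. This is a minimal sketch under a few assumptions that are not part of the code above: the names TransitionRatio and ratio are hypothetical, Map.of requires Java 9 or newer, a symbol outside A, C, G and T is simply counted as a non-transition mismatch, and the sketch throws an exception when there are no transversions instead of printing a non-finite value.

```java
import java.util.Map;

/** Minimal sketch of the transition/transversion ratio (hypothetical names). */
class TransitionRatio {

    // Each base mapped to its transition partner: A <-> G and C <-> T.
    private static final Map<Character, Character> PARTNER =
            Map.of('A', 'G', 'G', 'A', 'C', 'T', 'T', 'C');

    static double ratio(String s1, String s2) {
        if (s1.length() != s2.length()) {
            throw new IllegalArgumentException("strings must have equal length");
        }
        int ts = 0; // transitions
        int tv = 0; // transversions
        for (int i = 0; i < s1.length(); i++) {
            char a = s1.charAt(i);
            char b = s2.charAt(i);
            if (a == b) {
                continue; // no change at this position
            }
            Character partner = PARTNER.get(a);
            if (partner != null && partner == b) {
                ts++;
            } else {
                tv++; // assumption: any other mismatch counts as a transversion
            }
        }
        if (tv == 0) {
            throw new ArithmeticException("no transversions, ratio is undefined");
        }
        return (double) ts / tv;
    }

    public static void main(String[] args) {
        // Mismatches: A->G, G->A, C->T (3 transitions) and A->T, G->C (2 transversions).
        System.out.println(ratio("AGCTAGCT", "GATTTCCT")); // 1.5
    }
}
```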

Input and Output

As you can see, in the code above I used the next() function to read the dataset as strings (line 8).

I also used another input function, hasNext() (line 7), because it can read an unknown amount of data. That makes it very suitable for reading FASTA-formatted data, where a sequence may span several lines.
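As an illustration of this reading pattern, here is a minimal sketch that collects any number of FASTA records with hasNext() and next(). The names FastaTokens and readSequences and the headers ">seq_1" and ">seq_2" are hypothetical, and the sketch assumes that every header is a single token without spaces. It uses StringBuilder instead of String concatenation, which is a design choice of this sketch rather than something the code above does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

/** Minimal sketch: read FASTA records token by token (hypothetical names). */
class FastaTokens {

    /** Returns the sequence of each record; header text is discarded. */
    static List<String> readSequences(Scanner sc) {
        List<StringBuilder> records = new ArrayList<>();
        while (sc.hasNext()) {            // keeps going until the input ends
            String token = sc.next();     // one whitespace-separated token
            if (token.charAt(0) == '>') {
                records.add(new StringBuilder()); // a header starts a new record
            } else if (!records.isEmpty()) {
                // A sequence line: append it to the record currently being read.
                records.get(records.size() - 1).append(token);
            }
        }
        List<String> sequences = new ArrayList<>();
        for (StringBuilder record : records) {
            sequences.add(record.toString());
        }
        return sequences;
    }

    public static void main(String[] args) {
        String input = ">seq_1\nACGT\nAC\n>seq_2\nGGTT\nAA\n";
        System.out.println(readSequences(new Scanner(input))); // [ACGTAC, GGTTAA]
    }
}
```

To read from standard input instead of a string, pass new Scanner(System.in).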

For the output, I used the out.printf() function (line 37). It is a modified version of System.out.printf(), which is the default in Java. For more details, you can see the additional input and output code for that modification in my complete code on GitHub.
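For readers who do not have the complete code at hand, one common way to get such an out object is a buffered PrintWriter wrapped around System.out. The sketch below is only an assumption about what the modification could look like, not the actual definition from my repository; the names FastOutput and formatRatio are hypothetical, and Locale.ROOT is added so that the decimal separator is always a dot.

```java
import java.io.BufferedWriter;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.util.Locale;

/** Minimal sketch of a buffered output helper (hypothetical, not the author's actual code). */
class FastOutput {

    // Assumed stand-in for `out`: a PrintWriter that buffers writes to System.out.
    static final PrintWriter out =
            new PrintWriter(new BufferedWriter(new OutputStreamWriter(System.out)));

    /** Formats ts / tv with 11 digits after the decimal point. */
    static String formatRatio(int ts, int tv) {
        return String.format(Locale.ROOT, "%.11f", 1.0 * ts / tv);
    }

    public static void main(String[] args) {
        out.println(formatRatio(3, 2)); // 1.50000000000
        out.flush(); // buffered output only appears after a flush or close
    }
}
```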

That's it. If you want to ask something, you can write it in the comment section below. I hope this article is useful and see you in the next article! 


References:
Source image 1: https://www.facebook.com/ProjectRosalind/
Source image 2: https://en.wikipedia.org/wiki/Transversion
