Friday, 15 October 2021

Counting DNA Nucleotides (Rosalind | English)

  

rosalind



Hello everyone, how are you? This time I want to talk with you about bioinformatics. You still remember my first article about bioinformatics, right? In that article I explained what bioinformatics is and how it is applied in everyday life. You can call it "the introduction article". In this article (and maybe the following ones) I want to start discussing various bioinformatics problems, this time taken from a website that is quite popular in the bioinformatics community, rosalind.info.

Rosalind is a website that provides various bioinformatics problems we can use to train the skills needed for studying bioinformatics. Various skills, such as the Python language, algorithms, and bioinformatics applications, can be learned there.

Problem 01

Okay, let's jump into the first problem of this series. For reference, you can first check out the problem that will be discussed (here).

The title is "Counting DNA Nucleotides". As the name suggests, in this problem we will count the nucleotides that occur in DNA ('A', 'C', 'G', and 'T'). 'A' stands for adenine, 'C' for cytosine, 'G' for guanine, and 'T' for thymine.

This problem is quite simple. We are given a DNA string composed of four uppercase characters ('A', 'C', 'G', and 'T'). Then we are asked to count how many times adenine ('A'), cytosine ('C'), guanine ('G'), and thymine ('T') occur in it.

Here is the code for solving this problem (in Java):

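public void solve() {
    // Read the DNA string from the input.
    String s = in.read();
    // One counter per nucleotide.
    int A = 0;
    int C = 0;
    int G = 0;
    int T = 0;
    // Walk through the string one character at a time.
    for (char ch : s.toCharArray()) {
        if (ch == 'A') {
            A++;
        } else if (ch == 'C') {
            C++;
        } else if (ch == 'G') {
            G++;
        } else if (ch == 'T') {
            T++;
        }
    }
    // Print the four counts separated by spaces, in the order A C G T.
    out.println(A + " " + C + " " + G + " " + T);
}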

As you can see, the code is quite simple. We just read the characters (char) of s one by one and count how many times each character ('A', 'C', 'G', and 'T') occurs.
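If you want to try the same counting loop without my helper classes, here is a minimal self-contained sketch. The class name NucleotideCounter, the method names, and the short sample string are only for illustration; they do not come from Rosalind or from my repository. It stores the four counters in an array instead of four separate variables, but the idea is the same.

```java
class NucleotideCounter {
    // Returns the counts in the order A, C, G, T.
    static int[] countNucleotides(String dna) {
        int[] counts = new int[4];
        for (char ch : dna.toCharArray()) {
            switch (ch) {
                case 'A': counts[0]++; break;
                case 'C': counts[1]++; break;
                case 'G': counts[2]++; break;
                case 'T': counts[3]++; break;
                default: break; // ignore anything else, e.g. a stray newline
            }
        }
        return counts;
    }

    // Joins the four counts with single spaces, like the solve() output.
    static String format(int[] counts) {
        return counts[0] + " " + counts[1] + " " + counts[2] + " " + counts[3];
    }

    public static void main(String[] args) {
        // Use the first command-line argument if given, else a short sample.
        String dna = args.length > 0 ? args[0] : "AGCTTTTCA";
        System.out.println(format(countNucleotides(dna)));
    }
}
```

For the sample string "AGCTTTTCA" this prints 2 2 1 4, because the string contains two 'A', two 'C', one 'G', and four 'T'.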

For the output I used the function out.println(). That function is a modification of System.out.println(), which is very familiar in Java. You can see the additional code for that modification in my complete code on GitHub.

If you ask me why I used Java, the answer is that I like writing in Java and I like its code structure. On Rosalind, we don't have to use a particular programming language to answer a problem, because the requested answer is text, not code. Thus, you can pick any language you want (or can use), as long as your code gives the right answer.

That is true, but the bioinformatics world already has a language that is (in my opinion) the most familiar and most widely used: Python. The Rosalind website itself has a category of problems specifically for learning Python. Thus, if you are new to bioinformatics or want to learn Python, you can visit Rosalind and learn together with us.

That's all from me. If you want to ask something, you can write it in the comment section below. I hope this article is useful, and see you in the next article!
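To give a rough idea of what such a modification can look like, here is a minimal sketch that assumes in is a BufferedReader and out is a PrintWriter from the standard java.io package. The class name IoSketch and the method run are made up for this example, and the helper code in my repository may differ in its details. The sketch reads one line and prints its length.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;

class IoSketch {
    // Reads the first line from source and writes its length to sink.
    static void run(Reader source, Writer sink) {
        BufferedReader in = new BufferedReader(source);
        PrintWriter out = new PrintWriter(sink);
        try {
            String line = in.readLine();
            out.println(line == null ? 0 : line.length());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            // PrintWriter buffers its output, so flush before returning.
            out.flush();
        }
    }

    public static void main(String[] args) {
        run(new InputStreamReader(System.in), new OutputStreamWriter(System.out));
    }
}
```

Because a PrintWriter buffers what it writes, the flush at the end matters: without it, the answer may never show up on the screen.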


Reference:
Source of image 1: https://en.wikipedia.org/wiki/Nucleic_acid_secondary_structure
