postdoc-projects

Machine Learning algorithm to identify bio-signatures from tuberculosis-exposed and non-exposed individuals.

Biology is a complex system, and the human genome, with more than three billion base pairs, is especially complex. The story does not end with the number of base pairs or nucleotides: within the cell there is a further, even more complex society of molecules, each with a role in supporting that three-billion-base-pair system.

Epigenetics describes not only the action of biological macromolecules but also how the system learns from and adapts to its surroundings. Similarly, in computing, advanced algorithms learn from various kinds of data and apply what they have learned to another data set to generate or validate a hypothesis.

With big data, a human would need years to read through a data set, whereas a computer can read it far faster and, with a suitable algorithm, can also recognise patterns in the data to validate a hypothesis. With proper algorithm design, artificial intelligence can therefore reduce the workload for biologists.

Why do this?

  • Tuberculosis has a long co-evolutionary history with humankind and causes more than one million deaths worldwide per year, yet it is an infectious disease with only one known vaccine to date. Identifying a new biosignature could pave the way to the development of a new vaccine.

  • Machine learning is used to develop the biosignature from epigenetic data.

Related publication:
Das, J., Idh, N., Pehrson, I., Paues, J., & Lerm, M. (2021). A DNA methylome biosignature in alveolar macrophages from TB-exposed individuals predicts exposure to mycobacteria. medRxiv.
Link to the publication

Lerm, M., Das, J. (2022) Biomarker for detection of mycobacterial exposure and infection. WO patent. WO2022119495A1

Identification of Differential Methylation patterns of tuberculosis patients, household contacts and healthy participants.

DNA methylation is one of the epigenetic changes that regulate gene function in humans as well as in other living organisms. The cytosine bases of a DNA molecule are sometimes methylated by the addition of a methyl (-CH3) group at carbon 5 of the cytosine ring. At any given position, the measured methylation level varies because the site is methylated in some cells and not in others.

Array-based methylation profiling using Illumina arrays, like genome-wide methylation analysis, generates a large amount of data, which requires robust computational approaches to identify differential methylation patterns between sample groups. The array-based methylation level is mainly expressed as the β value, which is the ratio of the methylated signal to the total (methylated plus unmethylated) signal, with a constant offset added to the denominator. The equation for the β value is:

β = M / (M + UM + c)

where M = methylated value; UM = unmethylated value; c = constant = 100; 0 ≤ β ≤ 1.

Together with other epigenetic modifiers such as histone modifications and microRNAs, the main role of DNA methylation is to regulate gene expression levels, which is in turn reflected in protein expression.
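As a minimal sketch of the β value calculation, assuming the commonly used form β = M / (M + UM + c) with c = 100, the following Python function computes the value for a single probe. The function name and the probe intensities are hypothetical and serve only as an illustration; they are not taken from the project data.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Return the beta value M / (M + UM + c) for one probe.

    Assumes non-negative signal intensities. The offset c keeps the
    ratio stable when both intensities are close to zero.
    """
    if methylated < 0 or unmethylated < 0:
        raise ValueError("intensities must be non-negative")
    return methylated / (methylated + unmethylated + offset)


# Hypothetical probe intensities (methylated, unmethylated), illustration only.
probes = {"probe_a": (4500.0, 400.0), "probe_b": (300.0, 5700.0)}
betas = {name: beta_value(m, um) for name, (m, um) in probes.items()}
```

Because both intensities and the offset are non-negative, the result always lies between 0 (unmethylated) and 1 (fully methylated).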

A cytosine followed by a guanine forms a CpG site, and a stretch of DNA rich in CpG sites is known as a CpG island. The stretch can be short or extend to several hundred base pairs. CpG islands are distributed over transcription start sites (TSS), promoters, intergenic regions and gene body regions.

Why do this?

DNA methylation is variable and depends on the external environment. Exposure to different environmental conditions changes the methylation pattern. In this project, the hypothesis is that methylation patterns differ between individuals who have been exposed to tuberculosis and those who have not. These differences can lead to the identification of the responsible genes or pathways, which could support faster tuberculosis diagnosis and point to possible treatments.
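One simple way to look for such differences is to compare the mean β value of each CpG site between two groups. The sketch below is an illustration only and not the method used in the publications: the CpG identifiers, the β values and the 0.2 cut-off are all hypothetical, and a real analysis would add statistical testing with multiple-testing correction.

```python
from statistics import mean


def delta_beta(exposed, non_exposed):
    """Return mean(beta, exposed) - mean(beta, non-exposed) per shared CpG site."""
    shared_sites = sorted(set(exposed) & set(non_exposed))
    return {
        site: mean(exposed[site]) - mean(non_exposed[site])
        for site in shared_sites
    }


def differentially_methylated(deltas, threshold=0.2):
    """Return the sites whose absolute mean difference reaches the threshold."""
    return [site for site, diff in deltas.items() if abs(diff) >= threshold]


# Hypothetical beta values per CpG site and group, illustration only.
exposed = {"cpg_1": [0.80, 0.85, 0.75], "cpg_2": [0.30, 0.35, 0.25]}
non_exposed = {"cpg_1": [0.40, 0.45, 0.35], "cpg_2": [0.32, 0.30, 0.28]}

deltas = delta_beta(exposed, non_exposed)
hits = differentially_methylated(deltas)
```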

Related publication:
Pehrson, I., Braian, C., Karlsson, L., Idh, N., Danielsson, E.K., Andersson, B., Paues, J., Das, J. and Lerm, M., 2021. DNA methylation profiling of immune cells from tuberculosis-exposed individuals overlaps with BCG-induced epigenetic changes and correlates with the emergence of anti-mycobacterial ‘corralling cells’. medRxiv.
Link to the publication

Pehrson, I., Das, J., Idh, N., Karlsson, L., Rylander, H., af Segerstad, H.H., Reuterswärd, E., Marttala, E., Paues, J., Méndez-Aranda, M. and Ugarte-Gil, C., 2021. DNA methylomes derived from alveolar macrophages display distinct patterns in latent tuberculosis-implication for interferon gamma release assay status determination. medRxiv.
Link to the publication

Reduced Representation Bisulfite Sequencing (RRBS) data analysis from tuberculosis-exposed samples in different cell types.

A number of methods are available to identify genome-wide DNA methylation in humans and other organisms. Reduced representation bisulfite sequencing (RRBS) is one of these methods; it combines MspI restriction digestion during library preparation, which enriches for CpG-rich regions, with bisulfite conversion and high-throughput sequencing.

Bisulfite conversion preserves the cytosine bases that are methylated in the genome but converts unmethylated cytosines to uracil, which is read as thymine after amplification and sequencing, so methylated sites are easy to identify. The standard procedure is to prepare a library of bisulfite-converted DNA from a small amount of input sample (100 ng) and then run the library on a sequencer to generate short reads, which are saved in FASTQ format.
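The principle of bisulfite conversion can be illustrated with a small Python sketch. It simulates conversion on a single strand and then calls methylation by comparing the converted read with the reference. The sequence, the methylated position and the function names are hypothetical, and the sketch assumes a perfectly aligned read of the same length as the reference with no sequencing errors; production pipelines use dedicated bisulfite aligners instead.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate bisulfite conversion plus amplification on one DNA strand.

    Unmethylated cytosines are read as 'T'; methylated cytosines stay 'C'.
    `methylated_positions` holds 0-based indices of methylated cytosines.
    """
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, read):
    """Call methylation for every reference cytosine covered by the read.

    Returns {position: True} where the read still shows 'C' (methylated)
    and {position: False} where it shows 'T' (unmethylated).
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    calls = {}
    for index, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper())):
        if ref_base == "C" and read_base in ("C", "T"):
            calls[index] = read_base == "C"
    return calls


reference = "ACGTCCGA"  # hypothetical reference fragment
read = bisulfite_convert(reference, methylated_positions={1})
calls = call_methylation(reference, read)
```

Here only the cytosine at index 1 is marked as methylated, so it is the only cytosine that survives conversion in the simulated read.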

Why do this?

Genome-wide methylation analysis covers intergenic as well as genic regions and also identifies methylation patterns outside CpG islands. This offers considerable scope for identifying new features of disease-related patterns among individuals.

Related publication:
Karlsson L, Das J, Nilsson M, Tyrén A, Pehrson I, Idh N, Sayyab S, Paues J, Ugarte-Gil C, Méndez-Aranda M, Lerm M. A differential DNA methylome signature of pulmonary immune cells from individuals converting to latent tuberculosis infection. Scientific reports. 2021 Sep 30;11(1):1-3.
Link to the publication

SARS-CoV-2 and DNA Methylation

GitHub link

Related publication:
Huoman J, Sayyab S, Apostolou E, Karlsson L, Porcile L, Rizwan M, Sharma S, Das J, Rosén A, Lerm M. Epigenetic rewiring of pathways related to odour perception in immune cells exposed to SARS-CoV-2 in vivo and in vitro. Epigenetics. 2022 Jun 26:1-7.
Link to the publication

Maria Lerm, Jyotirmoy Das, Shumaila Sayyab. Method for determining SARS-CoV-2 exposure with or without remaining symptoms. 2022. WO2022119495A1

Image Processing with MATLAB

GitHub link

Related publication:
Kalsum S, Andersson B, Das J, Schön T, Lerm M. A high-throughput screening assay based on automated microscopy for monitoring antibiotic susceptibility of Mycobacterium tuberculosis phenotypes. BMC microbiology. 2021 Dec;21(1):1-4.