Zahera (Fathima) Khatoon

Austin, TX

MS Bioinformatics Candidate · Northeastern University

Summary

MS Bioinformatics candidate with hands-on experience in NGS analysis (FASTQ → VCF), ML for biological data, structural modeling, and reproducible pipelines on HPC (SLURM). Comfortable implementing core algorithms (Smith–Waterman, Needleman–Wunsch, UPGMA, ORF detection) from scratch in Python.

Education

Northeastern University · M.S. in Bioinformatics · Boston, MA · Remote

Sep 2025 – May 2027 (Expected)

Coursework: BINF 6200 (Bioinformatics Programming), BINF 6400 (Genomics & Computational Biology).

Osmania University · Postgraduate Diploma in Bioinformatics · Hyderabad, India

2021 – 2022

Coursework: sequence analysis, structural biology (homology modeling, docking), computational genomics.

Osmania University · B.Sc. in Microbiology, Biotechnology & Chemistry · Hyderabad, India

2005 – 2008

Experience

Medical Laboratory Technician · Sunflower Memory Care · Cedar Park, TX

Apr 2024 – May 2024

  • Performed routine clinical lab procedures: specimen collection, processing, and analysis.
  • Maintained HIPAA-compliant documentation; coordinated with staff for accurate test reporting.

Academic Projects

Cell Segmentation ML Pipeline · BioHack 2026 · Northeastern

Feb 2026

Deep-learning segmentation of microscopy images (Dice ≈ 0.87); reduced manual annotation time by ~60%.

Stack: Python, PyTorch, OpenCV

UPGMA Phylogenetics + Classifier · BINF 6400 · Northeastern

Mar 2026

Implemented UPGMA from scratch; trained a classifier on sequence features for taxonomic grouping.

Stack: Python, NumPy, Biopython, scikit-learn

NGS QC on Explorer HPC · BINF 6400 · Northeastern

Feb 2026

SLURM-scheduled QC workflow; retained >95% of reads as high-quality after adapter and low-quality base trimming.

Stack: Bash, SLURM, FastQC, MultiQC, Trimmomatic

Genome Assembly + ORF Detection · BINF 6400 · Northeastern

Feb 2026

Assembled short reads and wrote a custom ORF detector; validated against reference annotations.

Stack: Python, SPAdes, Biopython

CCDS Python Package · BINF 6200 · Northeastern

Sep – Dec 2025

Reusable CCDS analysis package with 96% test coverage and a 10/10 pylint score; CI via GitHub Actions.

Stack: Python, pytest, pylint, GitHub Actions

Pairwise Alignment + PSSM · BINF 6200 · Northeastern

Sep – Dec 2025

Implemented Smith–Waterman, Needleman–Wunsch, and PSSM scoring from scratch.

Stack: Python, NumPy

GLT6D1 Homology Modeling · Thesis · Osmania University

2024 – 2025

>92% of residues in favored Ramachandran regions; top docking pose ≈ −8.2 kcal/mol.

Stack: MODELLER, PyMOL, AutoDock Vina

Technical Skills

Languages: Python · R · Bash · SQL · C · C++

Bioinformatics: Biopython · BLAST · Clustal Omega · MODELLER · PyMOL · AutoDock Vina · FastQC · MultiQC · Trimmomatic · BWA · SAMtools · GATK · SPAdes · IGV

NGS & Genomics: FASTQ → VCF · Read QC & trimming · Variant calling · Genome assembly · ORF detection · Phylogenetics (UPGMA) · PSSM · Pairwise / MSA

ML & Engineering: PyTorch · scikit-learn · NumPy · Pandas · OpenCV · Git / GitHub · GitHub Actions · pytest · pylint · SLURM / HPC · Linux

Platforms & Formats: Explorer HPC · Jupyter · VS Code · FASTA · FASTQ · SAM/BAM · VCF · PDB · GFF/GTF