Paper assignment

Written on 23.05.24 by Alexey Gurevich

Nr  Paper          Student              Type
1   deepBGC        Lara Hombrecher      Proseminar
2   NeuRiPP        Sourish Chakraborty  Seminar
3   DeepRiPP       Adrian Scherhag      Proseminar
4   HypoRiPPAtlas  Paul Zeltner         Proseminar
5   ClusterFinder  Sizhe Sun            Seminar
6   RODEO          Vishwa Ujenia        Seminar
7   RRE-Finder     Kevin Cherian Koshy  Seminar
8   decRiPPter     Dino Milanovic       Seminar
9   CO-ED          Mouhammed Soliman    Seminar
10  GECCO          Taylan Göregen       Seminar
11  antiSMASH      Anjali Choudhary     Seminar
12  PRISM          Prafful Sharma       Seminar
13  SeMPI          Mahsa Kazemi         Seminar
14  ARTS           Laura Petri          Seminar
15  BAGEL          Khizar Mahmood       Seminar
16  eSNaPD         Adrian Tello         Proseminar



Participants and waiting list are announced

Written on 13.05.24 by Alexey Gurevich

All 34 received applications were ranked: 16 students were selected as participants, and the remaining 18 were placed on the waiting list.

If you didn't receive your status via email, please contact Alexey Gurevich as soon as possible.

Genome Mining (Pro/Seminar)


Genome mining is a computational technique for identifying previously uncharacterized natural product biosynthetic gene clusters (BGCs) in the genomes of sequenced organisms. These BGCs are responsible for the synthesis of natural products, many of which have important bioactivities, for example as antibiotics or anticancer compounds. This block pro-/seminar covers various genome mining tools and methods. Master's students who attend the seminar will be required to run the respective genome mining tool on the provided genome sequence, in addition to fulfilling the regular pro-/seminar requirements of a presentation and a text summary (required for both bachelor's and master's students).

General information

Tutor: Jun.-Prof. Dr. Alexey Gurevich

Language: English

Registration: email to before 23:59 on 28.04.2024 (also register in LSF to get CPs). Please provide brief information about

  1. Your background/experience,
  2. Previously passed seminars (if any),
  3. A short statement of motivation to attend this seminar (250 words maximum)


The final grade will rely on the following course components:

  • Presentation:
    • Talk of approx. 30 minutes (BSc) / 40 minutes (MSc)
    • Answering the questions from the audience after the presentation
  • Text summary:
    • Short description of your presented topic
    • Ca. 2 pages of text (with or without subsections), excluding title page, references, figures, tables, etc.
    • It is recommended to write the report in LaTeX to practice scientific writing (11 pt, 1.5 line spacing)
  • Trying the tool (MSc only!)
  • Participation in the seminar days:
    • Asking questions
    • A short written review of one other student's presentation

The component weights are: presentation 50% (BSc) / 45% (MSc), summary 30%, participation 20%, and trying the tool 0% (BSc) / 5% (MSc).
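The weights can be checked with a short calculation. The following is only an illustrative sketch: the weights come from the sentence above, but the 0-100 scale for component scores and the function name `final_score` are assumptions, and this page does not specify how the weighted result is converted into the final grade.

```python
# Component weights in percent, as stated in the course description.
WEIGHTS = {
    "BSc": {"presentation": 50, "summary": 30, "participation": 20, "tool": 0},
    "MSc": {"presentation": 45, "summary": 30, "participation": 20, "tool": 5},
}


def final_score(level, scores):
    """Return the weighted average of the component scores.

    `scores` maps component names to values on an assumed 0-100 scale;
    missing components count as 0.
    """
    weights = WEIGHTS[level]
    # Both weight sets must add up to 100%.
    assert sum(weights.values()) == 100
    return sum(weights[name] * scores.get(name, 0) for name in weights) / 100


if __name__ == "__main__":
    # Hypothetical component scores, for illustration only.
    bsc = {"presentation": 80, "summary": 90, "participation": 100}
    msc = {"presentation": 80, "summary": 90, "participation": 100, "tool": 100}
    print(final_score("BSc", bsc))
    print(final_score("MSc", msc))
```

With these hypothetical scores, the tool component contributes nothing for BSc students, while for MSc students it takes 5 percentage points from the presentation weight.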

Useful materials


For BSc and MSc (deep learning-based methods)

1. [deepBGC] A deep learning genome-mining strategy for biosynthetic gene cluster prediction (Nucleic Acids Research, 2019)
2. [NeuRiPP] NeuRiPP: Neural network identification of RiPP precursor peptides (Scientific Reports, 2019)
3. [DeepRiPP] DeepRiPP integrates multiomics data to automate discovery of novel ribosomally synthesized natural products (PNAS, 2019)
4. [HypoRiPPAtlas] HypoRiPPAtlas as an Atlas of hypothetical natural products for mass spectrometry database search (Nature Communications, 2023)

For BSc and MSc (more classical methods)

5. [ClusterFinder] Insights into Secondary Metabolism from a Global Analysis of Prokaryotic Biosynthetic Gene Clusters (Cell, 2014)
6. [RODEO] A new genome-mining tool redefines the lasso peptide biosynthetic landscape (Nature Chemical Biology, 2017)
7. [RRE-Finder] RRE-Finder: a Genome-Mining Tool for Class-Independent RiPP Discovery (mSystems, 2019)
8. [decRiPPter] Expansion of RiPP biosynthetic space through integration of pan-genomics and machine learning uncovers a novel class of lanthipeptides (PLOS Biology, 2020)
9. [CO-ED] Co-occurrence of enzyme domains guides the discovery of an oxazolone synthetase (Nature Chemical Biology, 2021)
10. [GECCO] Accurate de novo identification of biosynthetic gene clusters with GECCO (bioRxiv, 2021)

For MSc only

11. [antiSMASH] antiSMASH: Rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters (Nucleic Acids Research, 2011) (+ consider follow-up papers)
12. [PRISM] Comprehensive prediction of secondary metabolite structure and biological activity from microbial genome sequences (Nature Communications, 2020) (+ consider the previous papers)
13. [SeMPI] SeMPI 2.0—A Web Server for PKS and NRPS Predictions Combined with Metabolite Screening in Natural Product Databases (Metabolites, 2021) (+ consider the first paper)
14. [ARTS] ARTS 2.0: feature updates and expansion of the Antibiotic Resistant Target Seeker for comparative genome mining (Nucleic Acids Research, 2020) (+ consider the first paper)
15. [BAGEL] BAGEL4: a user-friendly web server to thoroughly mine RiPPs and bacteriocins (Nucleic Acids Research, 2018) (+ consider the previous papers)

For BSc only

16. [eSNaPD] eSNaPD: a versatile, web-based bioinformatics platform for surveying and mining natural product biosynthetic diversity from metagenomes (Chemistry & Biology, 2014)

Important dates

Kick-off meeting (introduction to the field by A.G., general questions, paper assignment): week of 14-17.05.2024 (originally planned for 6-8.05.2024). The exact date/time will be decided by a vote among the registered participants. UPD: 15.05.2024 at 12:00. The event will be online via MS Teams; the recording will be available to the registered participants.

Deadline for paper selection: 23:59 on 22.05.2024 (the kick-off date + 1 week)

Summary submission deadline: 23:59 on 30.06.2024 (for optional feedback, send it two weeks before the deadline)

Deadline for feedback on your slides (optional): 2 weeks before the presentations

Presentations (i.e., the seminar day/s): TBA (August/September). The event will be in person in E2.1 (CBI), R1.06 (the seminar room). In exceptional circumstances, it will be possible to join the seminar online; contact the tutor in advance.

Peer reviews: 23:59, two days after the presentation. Everyone will be assigned to review one particular student from the same presentation day; the assignment will be published two days before the presentation.


