Methods for building bioinformatics pipelines

Let's first look at a job posting:

Bioinformatics pipelines can be classified in different ways depending on the criteria used, for example:

A review of bioinformatics pipeline frameworks classifies modern frameworks into implicit convention frameworks, explicit frameworks, configuration-based frameworks, and class-based frameworks.

Another classification divides bioinformatics analysis pipelines into several schools: scripting-language pipelines, common workflow language pipelines, Makefile pipelines, configuration-file pipelines, Jupyter notebook pipelines, and R Markdown pipelines.

In my opinion, bioinformatics pipelines fall into two categories: the old approach and the new approach (stating the obvious ~ ~ ~), so let's look at each in turn.

The traditional approach is still the most commonly used way to build pipelines today, especially in industry.

Disadvantages:

The new approach uses the workflow tools that have become popular recently, although in practice they are not widely adopted in industry.

CWL (Common Workflow Language) and WDL (Workflow Description Language) are workflow description languages. You define the inputs and outputs of each computational step (script), then connect those inputs and outputs to form a data analysis pipeline.
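The idea of declaring each step's inputs and outputs and letting the connections determine the pipeline can be sketched in plain Python. This is only an illustration of the principle, not CWL or WDL syntax; the step names and file names (`trim`, `align`, `call_peaks`, `raw.fastq`, and so on) are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical steps, deliberately listed out of order.
# Each step only declares which files it reads and which it writes.
STEPS = {
    "call_peaks": {"inputs": ["aligned.bam"], "outputs": ["peaks.bed"]},
    "trim": {"inputs": ["raw.fastq"], "outputs": ["trimmed.fastq"]},
    "align": {"inputs": ["trimmed.fastq", "genome.fa"], "outputs": ["aligned.bam"]},
}


def execution_order(steps):
    """Derive a valid run order purely from the declared inputs and outputs."""
    # Map every output file to the step that produces it.
    producer = {}
    for name, spec in steps.items():
        for out in spec["outputs"]:
            producer[out] = name

    # A step depends on whichever steps produce its inputs.
    # Inputs nobody produces (raw.fastq, genome.fa) are external files.
    graph = {
        name: {producer[f] for f in spec["inputs"] if f in producer}
        for name, spec in steps.items()
    }
    return list(TopologicalSorter(graph).static_order())


print(execution_order(STEPS))  # ['trim', 'align', 'call_peaks']
```

Workflow engines do essentially this linking for you, and add scheduling, caching, and cluster submission on top.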

Such pipelines can run on multiple platforms, such as a local server, an SGE cluster, or a cloud computing platform: write once, run anywhere. On the Linux command line, the best-known tools are Snakemake, Nextflow, and bpipe. Graphical tools include Galaxy and the cloud platforms of some commercial companies (just drag and drop icons).

Cromwell is a workflow management engine developed by the Broad Institute that supports two workflow description languages, WDL and CWL.

Syntax examples of CWL and Snakemake:

Reference blog post:

/p/8e57fd2b81b2

WDL syntax structure:

Example:

Please refer to the blog post: https://wenlongshen.github.io/2018/09/15/piping-solution-2/

Docker is not a pipeline method but a container packaging tool, so treating it as a separate category is a stretch; it is really an extension of the two approaches above. We package a pipeline we have developed as a Docker image, which makes it easy to use and share.

Take MACS2, a peak-calling tool commonly used in ChIP-seq analysis, as an example.
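As a rough sketch of what running MACS2 from a container looks like, the helper below builds a `docker run` command without executing it. The image name `example/macs2:latest` and the `/data` mount point are assumptions made for illustration, and the image is assumed to have `macs2` on its PATH; the `macs2 callpeak` options used (`-t`, `-c`, `-f`, `-g`, `-n`, `--outdir`) are standard MACS2 options.

```python
def macs2_docker_command(data_dir, treatment, control, name,
                         image="example/macs2:latest", genome="hs"):
    """Build (but do not run) a docker command that calls peaks with MACS2.

    data_dir  -- host directory holding the BAM files, mounted at /data
    treatment -- ChIP sample BAM file name inside data_dir
    control   -- input/control BAM file name inside data_dir
    name      -- prefix MACS2 uses for its output files
    image     -- hypothetical image name; replace with your own image
    """
    return [
        "docker", "run", "--rm",        # remove the container when it exits
        "-v", f"{data_dir}:/data",      # share host data with the container
        image,
        "macs2", "callpeak",
        "-t", f"/data/{treatment}",     # treatment (ChIP) sample
        "-c", f"/data/{control}",       # control (input) sample
        "-f", "BAM",                    # input file format
        "-g", genome,                   # effective genome size ("hs" = human)
        "-n", name,                     # output name prefix
        "--outdir", "/data/peaks",      # results land back on the host
    ]


cmd = macs2_docker_command("/home/user/chipseq", "chip.bam", "input.bam", "sample1")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where Docker and the image are available.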

For details, please refer to the blog post:

https://wenlongshen.github.io/2018/09/08/piping-solution-1/