ShareSeq - an NGS protocol that makes low- to mid-scale projects affordable on Illumina sequencers

Use only the sequencing capacity you really need

The ShareSeq workflow combines samples from different investigators in a single sequencing run. This significantly lowers the cost of each project while keeping all the benefits of Next-Generation Sequencing: high capacity (customizable, based on the allocation set by the investigator), standardized library preparation protocols, scalability, standard data formats, etc.

High-throughput sequencers typically offer very high run capacities and generate large amounts of data. This inevitably comes with high costs, so every sequencing project needs to be considered and planned carefully. It is not unusual for an investigator to be able to afford only a single Next-Generation Sequencing (NGS) experiment to test a hypothesis, and a failure of that experiment, for whatever reason, may have unpleasant consequences.

ShareSeq has been designed to overcome precisely these hurdles and to make low- to mid-scale, cost-limited NGS projects feasible.

DNA or RNA samples prepared according to our standard sample submission guidelines are suitable for ShareSeq analysis. Samples that pass quality control are labeled with unique identifiers (barcodes). The prepared DNA libraries are quantified, pooled in ratios reflecting customer allocations, and sequenced together with an internal quality control. The final sequence reads are sorted by barcode, analyzed, and made available to investigators.
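As an illustration of pooling "in ratios reflecting customer allocations", the arithmetic can be sketched as follows. This is a minimal sketch, not our laboratory procedure: the function name `pooling_volumes`, the library names, the concentrations and the pool volume are all made up for the example. It assumes that each library should contribute a share of molecules equal to its share of the run, so the pipetted volume is proportional to allocation divided by concentration.

```python
def pooling_volumes(concentrations_nM, allocations, total_volume_ul):
    """Return the volume (µl) of each library to add to the pool.

    Assumption: a library's share of molecules in the pool should equal
    its share of the run, so volume is proportional to
    allocation / concentration.
    """
    if set(concentrations_nM) != set(allocations):
        raise ValueError("every library needs a concentration and an allocation")
    total_alloc = sum(allocations.values())
    # Relative volume each library needs to contribute its share of molecules.
    weights = {
        name: (allocations[name] / total_alloc) / concentrations_nM[name]
        for name in allocations
    }
    # Rescale so that the volumes add up to the requested pool volume.
    scale = total_volume_ul / sum(weights.values())
    return {name: w * scale for name, w in weights.items()}


# Hypothetical example: three investigators share one run 50 % / 30 % / 20 %.
concentrations = {"lab_A": 10.0, "lab_B": 5.0, "lab_C": 20.0}  # nM, made up
allocations = {"lab_A": 0.5, "lab_B": 0.3, "lab_C": 0.2}
volumes = pooling_volumes(concentrations, allocations, total_volume_ul=60.0)
for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} µl")
```

A more dilute library needs a larger volume to reach the same share, which is why lab_B receives the largest volume in this example despite not having the largest allocation.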


Results and data analysis

The ShareSeq workflow includes a compulsory ShareSeq Basic primary data analysis. The raw data are converted to bases, each with its own quality score (Phred score). Raw reads containing low-quality bases are filtered out to obtain a high-quality data set. Adapters and their residues are identified and removed. The individual reads are then sorted into sample-specific groups based on their barcodes.

The ShareSeq Basic data analysis pipeline includes:

  • The primary raw data are converted to nucleobases - Basecalling
  • Raw reads from NGS sequencing carry a predicted error probability for each base, expressed as a quality (Q) score. In many applications it is important to filter reads to reduce the number of errors - Error Correction
  • Reads can contain adapter sequence, which has to be identified and clipped before data analysis - Adapter Clipping
  • Specific barcode sequences are added to each sample during library preparation, so reads can be distinguished and sorted after sequencing - Barcode sorting
  • Sequence quality is not consistent, either within a read or between reads generated in the same sequencing run, so low-quality data may compromise data analysis. Low-quality bases are therefore removed iteratively from one or both ends of each read to ensure that the resulting reads are of high quality - Quality trimming
  • In the last step, the reads are statistically evaluated to report base quality, read length, GC content and other descriptive information - Quality assessment
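To make the adapter clipping, quality trimming and barcode sorting steps more concrete, here is a minimal sketch in Python. It is an illustration under simplifying assumptions, not the software used in the ShareSeq pipeline: reads are given as hypothetical (barcode, sequence, quality) tuples, qualities use the Phred+33 encoding, the adapter is matched exactly, and the thresholds `min_q` and `min_len` are arbitrary example values.

```python
from collections import defaultdict

PHRED_OFFSET = 33  # assumption: Phred+33 quality encoding


def clip_adapter(seq, qual, adapter):
    """Cut the read at the first exact occurrence of the adapter."""
    pos = seq.find(adapter)
    if pos == -1:
        return seq, qual
    return seq[:pos], qual[:pos]


def trim_ends(seq, qual, min_q=20):
    """Remove bases with quality below min_q from both ends of the read."""
    scores = [ord(c) - PHRED_OFFSET for c in qual]
    start, end = 0, len(scores)
    while start < end and scores[start] < min_q:
        start += 1
    while end > start and scores[end - 1] < min_q:
        end -= 1
    return seq[start:end], qual[start:end]


def process(reads, adapter, min_q=20, min_len=5):
    """Clip, trim, drop too-short reads and group the rest by barcode."""
    by_barcode = defaultdict(list)
    for barcode, seq, qual in reads:
        seq, qual = clip_adapter(seq, qual, adapter)
        seq, qual = trim_ends(seq, qual, min_q)
        if len(seq) >= min_len:
            by_barcode[barcode].append((seq, qual))
    return dict(by_barcode)


# Made-up reads; '#' encodes Q2 and 'I' encodes Q40 in Phred+33.
adapter = "AGATCGGAAG"  # illustrative adapter prefix
reads = [
    ("ACGT", "TTGACCAGTA" + "AGATCGGAAGAG", "I" * 22),  # adapter at the 3' end
    ("ACGT", "GGCATTCGAT", "##IIIIII##"),               # low-quality ends
    ("TGCA", "AACCGGTTAA", "IIIIIIIIII"),               # clean read
    ("TGCA", "ACGTACGTAC", "##########"),               # discarded entirely
]
grouped = process(reads, adapter)
for barcode, kept in grouped.items():
    print(barcode, [seq for seq, _ in kept])
```

Production tools typically tolerate mismatches in adapters and barcodes and often trim with a sliding window; exact matching is used here only to keep the sketch short.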

Sequencing results are provided as unaligned BAM (UBAM) or FASTQ files. The mean accuracy of the filtered data is usually above 99% (Phred score ≥20). A typical read length distribution after default quality trimming and filtering is shown on the right.
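The link between a Phred score and accuracy follows from the standard definition Q = -10 · log10(P), where P is the probability that a base call is wrong. The short sketch below shows the conversion and one way to estimate the mean accuracy of a read from its FASTQ quality string. The function names and the Phred+33 offset are assumptions made for this example.

```python
import math


def error_probability(q):
    """Probability that a base with Phred score q is called incorrectly."""
    return 10 ** (-q / 10)


def phred_from_error(p):
    """Phred score corresponding to an error probability p."""
    return -10 * math.log10(p)


def mean_accuracy(quality_string, offset=33):
    """Mean per-base accuracy of a read, averaged over error probabilities.

    Assumption: the quality string uses the Phred+33 encoding.
    """
    probs = [error_probability(ord(c) - offset) for c in quality_string]
    return 1 - sum(probs) / len(probs)


# Q20 corresponds to an error probability of 0.01, i.e. 99 % accuracy.
print(error_probability(20))
# One Q20 base ('5') and one Q40 base ('I') in Phred+33.
print(mean_accuracy("5I"))
```

Note that the average is taken over error probabilities rather than over Q scores, because the Phred scale is logarithmic.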

Within the ShareSeq workflow you can also opt for ShareSeq Advanced data analysis pipelines (fixed price per sample). The available options are:

  • De novo genome assembly. De novo sequencing is used for unknown genomes for which no reference sequence is available. We assemble contigs and scaffold them to build the final consensus sequence.
  • Resequencing and variant calling. We prepare structured output listing all polymorphisms between the obtained reads and the reference sequence you provide.
[Figure: data analysis results graph]

We know that each NGS project has different data analysis requirements and that the outcomes of the ShareSeq Basic or ShareSeq Advanced pipelines are not always sufficient. We therefore offer a project-oriented custom data analysis service, ShareSeq Custom. Whether you lack experience in certain areas of data analysis or are looking for a special bioinformatics pipeline to help answer your scientific questions, our bioinformaticians can help you with any type of data analysis. This requires an individual consultation: please submit your inquiry during the ShareSeq ordering process and we will get in touch with you shortly. Analysis of your samples will not be delayed.

© SEQme s.r.o., 2012 - 2019. All rights reserved.