Most samples studied by massively parallel techniques such as microarrays or next-generation sequencing are heterogeneous at the cellular level. This is especially true for tumours, where different subpopulations of cancer cells are mixed with stromal cells. Conventional bulk tissue analysis averages expression over all cells, which can produce an unrealistic combination of expressed transcripts and hide the expression of lowly expressed but important genes. These effects limit research discoveries, mask the underlying biological processes and may cause clinical problems in the diagnosis of cancer patients. The main goal of the project is to improve patient classification using statistically independent transcriptional signals found in an ensemble of cell subpopulations. To deconvolute the bulk transcriptome into such signals, independent component analysis (ICA) will be applied to gene expression and exon junction expression data. By considering junctions, we will account for variability at the gene isoform level and target distinct cell subtypes more specifically. ICA will thus serve as a feature selection method for the subsequent classification of the samples. Several classifiers will be considered in the project; the best-performing one will be trained on publicly available data and validated on an independent dataset. Brain gliomas of different stages and two non-small-cell lung cancers were chosen as prototypical examples of heterogeneous cancers. By in silico deconvolution of the bulk samples we will not only improve the sensitivity of classifiers to low-abundance cell populations, but also provide new biological knowledge about processes taking place in the distinct tumour and stroma cell subtypes. We will identify key regulators responsible for these processes; this information can later be used to define suitable therapeutic targets.
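
The workflow described above (deconvolution by independent component analysis (ICA), then classification on the resulting per-sample weights) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, scikit-learn's `FastICA` and `LogisticRegression` stand in for whichever ICA implementation and classifier the project finally adopts, and the names `simulate` and `sample_weights` are hypothetical helpers. Whitening is left at the library default.

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_genes, n_components = 2000, 3
n_train, n_test = 60, 30

# Hypothetical gene signatures: heavy-tailed, hence non-Gaussian, as ICA requires.
S_true = rng.laplace(size=(n_genes, n_components))

def simulate(n_samples):
    """Simulate bulk profiles (genes x samples) as noisy mixtures of the signatures."""
    labels = rng.integers(0, 2, size=n_samples)
    weights = rng.normal(size=(n_samples, n_components))
    weights[:, 0] += 3.0 * labels  # assumption: one signature carries the class signal
    X = S_true @ weights.T + 0.5 * rng.normal(size=(n_genes, n_samples))
    return X, labels

def sample_weights(S, X):
    """Least-squares weight of every sample (column of X) on the components in S."""
    Xc = X - X.mean(axis=0, keepdims=True)  # centre each sample over genes
    W, *_ = np.linalg.lstsq(S, Xc, rcond=None)
    return W.T  # shape: (n_samples, n_components)

X_train, y_train = simulate(n_train)  # plays the role of the public training data
X_test, y_test = simulate(n_test)     # plays the role of the independent dataset

# Genes are treated as observations, so the components are independent across genes.
ica = FastICA(n_components=n_components, random_state=0, max_iter=1000)
S_est = ica.fit_transform(X_train)    # shape: (n_genes, n_components)

# Per-sample component weights are the features passed to the classifier.
W_train = sample_weights(S_est, X_train)
W_test = sample_weights(S_est, X_test)

clf = LogisticRegression().fit(W_train, y_train)
accuracy = clf.score(W_test, y_test)
print(f"validation accuracy: {accuracy:.2f}")
```

Projecting the independent dataset onto components estimated from the training data alone keeps the validation samples out of the deconvolution step. Under the same assumptions, an exon junction matrix could be passed through the identical steps in place of the gene expression matrix.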