Publications
List of the 2008 publications of the operating units (U.O.) of the RNBIO network
RNBIO Newsletter, January 2009, Year 1, Issue 1
The publications
Despite decades of investigations, it is not yet clear whether there are rules dictating the specificity of the interaction between amino acids and nucleotide bases. This issue was addressed by determining, in a dataset consisting of 100 high-resolution protein-DNA structures, the frequency and energy of interaction between each amino acid and base, and the energetics of water-mediated interactions. The analysis was carried out using HINT, a non-Newtonian force field encoding both enthalpic and entropic contributions, and Rank, a geometry-based tool for evaluating hydrogen bond interactions. A frequency- and energy-based preferential interaction of Arg and Lys with G, Asp and Glu with C, and Asn and Gln with A was found. Not only favorable, but also unfavorable contacts were found to be conserved. Water-mediated interactions strongly increase the probability of Thr-A, Lys-A, and Lys-C contacts. The frequency, interaction energy, and water enhancement factors associated with each amino acid-base pair were used to predict the base triplet recognized by the helix motif in 45 zinc fingers, which represents an ideal case study for the analysis of one-to-one amino acid-base pair contacts. The model correctly predicted 70.4% of 135 amino acid-base pairs, and, by weighting the energetic relevance of each amino acid-base pair to the overall recognition energy, it yielded a prediction rate of 89.7%. (c) 2008 Wiley Periodicals, Inc.
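The difference between a plain prediction rate and an energy-weighted one can be illustrated with a minimal Python sketch. The abstract does not give the weighting scheme, so the function name `prediction_rates`, the per-pair weights and the toy values below are all hypothetical; the sketch only shows how weighting each amino acid-base pair by its relevance can raise the rate when the mispredicted pairs carry little weight.

```python
from typing import List, Tuple


def prediction_rates(pairs: List[Tuple[bool, float]]) -> Tuple[float, float]:
    """Return (unweighted_rate, weighted_rate) for a list of predictions.

    Each item is (correct, weight): whether the amino acid-base pair was
    predicted correctly, and a non-negative weight standing in for its
    energetic relevance (an assumption made for this illustration).
    """
    if not pairs:
        raise ValueError("at least one prediction is required")
    total_weight = sum(weight for _, weight in pairs)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Plain rate: every pair counts the same.
    unweighted = sum(1 for correct, _ in pairs if correct) / len(pairs)
    # Weighted rate: each pair counts in proportion to its weight.
    weighted = sum(weight for correct, weight in pairs if correct) / total_weight
    return unweighted, weighted


if __name__ == "__main__":
    # Toy data, not taken from the paper: the single wrong prediction
    # has a low weight, so the weighted rate exceeds the plain one.
    toy = [(True, 3.0), (True, 1.0), (False, 0.5), (True, 0.5)]
    plain, by_weight = prediction_rates(toy)
    print(f"unweighted: {plain:.3f}, weighted: {by_weight:.3f}")
```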
BACKGROUND: Microarray experiments enable simultaneous measurement of the expression levels of virtually all transcripts present in cells, thereby providing a 'molecular picture' of the cell state. On the other hand, the genomic responses to a pharmacological or hormonal stimulus are dynamic molecular processes, where time influences gene activity and expression. The potential use of the statistical analysis of microarray data in time series has not been fully exploited so far, because only a few methods are available that properly take into account temporal relationships between samples. RESULTS: We compared here four different methods to analyze data derived from a time course mRNA expression profiling experiment which consisted of a study of the effects of estrogen on hormone-responsive human breast cancer cells. Gene expression was monitored with the innovative Illumina BeadArray platform, which includes an average of 30-40 replicates for each probe sequence randomly distributed on the chip surface. We present and discuss the results obtained by applying to these datasets different statistical methods for serial gene expression analysis. The influence of the normalization algorithm applied to the data and of different parameter or threshold choices for the selection of differentially expressed transcripts has also been evaluated. In most cases, the selection was found to be fairly robust with respect to changes in parameters and type of normalization. We then identified which genes showed an expression profile significantly affected by the hormonal treatment over time. The final list of differentially expressed genes underwent cluster analysis of functional type, to identify groups of genes with similar regulation dynamics. CONCLUSIONS: Several methods for processing time series gene expression data are presented, including evaluation of benefits and drawbacks of the different methods applied. The resulting protocol for data analysis was applied to characterization of the gene expression changes induced by estrogen in human breast cancer ZR-75.1 cells over an entire cell cycle.
The regulation of gene transcription requires posttranslational modifications of histones that, in concert with chromatin remodeling factors, shape the structure of chromatin. It is currently under intense investigation how this structure is modulated, in particular in the context of proliferation and differentiation. Compelling evidence suggests that the transcription factor NF-Y acts as a master regulator of cell cycle progression, activating the transcription of many cell cycle regulatory genes. However, the underlying molecular mechanisms are not yet completely understood. Here we show that NF-Y exerts its effect on transcription through the modulation of the histone “code”. NF-Y colocalizes with nascent RNA, while RNA polymerase II is phosphorylated on serine 2 of the YSPTSPS repeats within its carboxy-terminal domain and histones are carrying modifications that represent activation signals of gene expression (H3K9ac and PAN-H4ac). Comparing postmitotic muscle tissue from normal mice and proliferating muscles from mdx mice, we demonstrate by chromatin immunoprecipitation (ChIP) that NF-Y DNA binding activity correlates with the accumulation of acetylated histones H3 and H4 on promoters of key cell cycle regulatory genes, and with their active transcription. Accordingly, p300 is recruited onto the chromatin of NF-Y target genes in an NF-Y-dependent manner, as demonstrated by Re-ChIP. Conversely, the loss of NF-Y binding correlates with a decrease of acetylated histones, the recruitment of HDAC1, and a repressed heterochromatic state with enrichment of histones carrying modifications known to mediate silencing of gene expression (H3K9me3, H3K27me2 and H4K20me3). As a consequence, NF-Y target genes are downregulated in this context. In conclusion, our data indicate a role of NF-Y in modulating the structure and transcriptional competence of chromatin in vivo and support a model in which NF-Y-dependent histone “code” changes contribute to the proper discrimination between proliferating and postmitotic cells in vivo and in vitro.
It has been proposed that in cancer, where the bulk of the genome becomes hypomethylated, there is an increase in transcriptional noise that might lead to the generation of antisense transcripts that could affect the function of key oncosuppressor genes, ultimately leading to malignant transformation. Here, we describe the computational identification of a melanoma-enriched antisense transcript, TRPM2-AS, mapped within the locus of TRPM2, an ion channel capable of mediating susceptibility to cell death. Analysis of the TRPM2-AS genomic region indicated the presence in the same region of another tumor-enriched TRPM2 transcript, TRPM2-TE, located across a CpG island shared with TRPM2-AS. Quantitative PCR experiments confirmed that TRPM2-AS and TRPM2-TE transcripts were up-regulated in melanoma, and their activation was consistent with the methylation status of the shared CpG island. Functional knock-out of TRPM2-TE, as well as over-expression of wild-type TRPM2, increased melanoma susceptibility to apoptosis and necrosis. Finally, expression analysis in other cancer types indicated that TRPM2-AS and TRPM2-TE over-expression might have an even wider role than anticipated, reinforcing the relevance of our computational approach in identifying new potential therapeutic targets.
In considering key events of genomic disorders in the development and progression of cancer, the correlation between genomic instability and carcinogenesis is currently under investigation. In this work, we propose an inductive logic programming approach to the problem of modeling evolution patterns for breast cancer. Using this approach, it is possible to extract fingerprints of stages of the disease that can be used to develop and deliver the most adequate therapies to patients. Furthermore, such a model can help physicians and biologists in the elucidation of molecular dynamics underlying the aberrations-waterfall model behind carcinogenesis. By showing results obtained on a real-world dataset, we try to give some hints about further approaches to the knowledge-driven validation of such hypotheses.
IGHV3-21–using chronic lymphocytic leukemia (CLL) is a distinct entity with restricted immunoglobulin gene features and poor prognosis and is more frequently encountered in Northern than Southern Europe. To further investigate this subset and its geographic distribution in the context of a country (Italy) with both continental and Mediterranean areas, 37 IGHV3-21 CLLs were collected out of 1076 cases enrolled by different institutions from Northern or Central-Southern Italy. Of the 37 cases, 18 were identified as homologous (hom)HCDR3–IGHV3-21 CLLs and were found almost exclusively (16 of 18) in Northern Italy; in contrast, 19 nonhomHCDR3–IGHV3-21 cases were evenly distributed throughout Italy. Clinically, poor survivals were documented for IGHV3-21 CLLs as well as for subgroups of mutated and homHCDR3–IGHV3-21 CLLs. Negative prognosticators CD38, ZAP-70, CD49d, and CD79b were expressed at higher levels in homHCDR3 than nonhomHCDR3–IGHV3-21 cases. Differential gene expression profiling (GEP) of 13 IGHV3-21 versus 52 non–IGHV3-21 CLLs identified, among 122 best-correlated genes, TGFB2 and VIPR1 as down- and up-regulated in IGHV3-21 CLL cases, respectively. Moreover, GEP of 7 homHCDR3 versus 6 nonhomHCDR3–IGHV3-21 CLLs yielded 20 differentially expressed genes, with WNT-16 being the one expressed at the highest levels in homHCDR3–IGHV3-21 CLLs. Altogether, IGHV3-21 CLLs, including those with homHCDR3, had a peculiar global phenotype in part explaining their worse clinical outcome.
Multiple sequence alignments are successfully applied in many studies for understanding the structural and functional relations among single nucleic acids and protein sequences as well as whole families. Because of the rapid growth of sequence databases, multiple sequence alignments can often be very large and difficult to visualize and analyze. We offer a new service aimed at visualizing and analyzing the multiple alignments obtained with different external algorithms, with new features useful for the comparison of the aligned sequences as well as for the creation of a final image of the alignment. The service is named FASMA and is available at http://bioinformatica.isa.cnr.it/FASMA/.
The center of mass of a protein is an artificial point useful for detecting important and simple features of protein structure, shape and association. CALCOM is a software tool that calculates the center of mass of a protein, starting from PDB protein structure files. In the case of protein complexes and of protein-small ligand complexes, the position of protein residues or of ligand atoms with respect to each protein subunit can be evaluated, as well as the distances between the centers of mass of the protein subunits, in order to compare different conformations and evaluate the relative motion of subunits.
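As an illustration of the underlying calculation (not of CALCOM itself, whose implementation is not described here), the following minimal Python sketch computes a mass-weighted center of mass from ATOM/HETATM records using the fixed-column PDB layout, and the distance between two such centers. The function names, the small mass table and the fallback to a carbon mass for unlisted elements are assumptions made for this sketch.

```python
import math
from typing import Iterable, Iterator, Tuple

Atom = Tuple[str, float, float, float]  # (element symbol, x, y, z)

# Approximate atomic masses in daltons for elements common in proteins.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007,
               "O": 15.999, "P": 30.974, "S": 32.06}
DEFAULT_MASS = 12.011  # assumption: unlisted elements are treated as carbon


def parse_pdb_atoms(lines: Iterable[str]) -> Iterator[Atom]:
    """Yield (element, x, y, z) for each ATOM/HETATM record."""
    for line in lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # Fixed-column PDB layout: x, y, z occupy columns 31-38, 39-46, 47-54.
        x = float(line[30:38])
        y = float(line[38:46])
        z = float(line[46:54])
        # Element symbol occupies columns 77-78; older files may omit it.
        element = line[76:78].strip().upper()
        if not element:
            # Crude fallback: first character of the atom name (columns 13-16).
            element = line[12:16].strip()[:1].upper()
        yield element, x, y, z


def center_of_mass(atoms: Iterable[Atom]) -> Tuple[float, float, float]:
    """Return the mass-weighted mean position of the given atoms."""
    total = sx = sy = sz = 0.0
    for element, x, y, z in atoms:
        mass = ATOMIC_MASS.get(element, DEFAULT_MASS)
        total += mass
        sx += mass * x
        sy += mass * y
        sz += mass * z
    if total == 0.0:
        raise ValueError("no atoms supplied")
    return sx / total, sy / total, sz / total


def center_distance(atoms_a: Iterable[Atom], atoms_b: Iterable[Atom]) -> float:
    """Distance between the centers of mass of two atom sets (e.g. subunits)."""
    return math.dist(center_of_mass(atoms_a), center_of_mass(atoms_b))
```

To compare subunits, one would group the parsed atoms by chain identifier (column 22 of each record) and call `center_distance` on each pair of groups; repeating this for two conformations gives a simple measure of relative subunit motion.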
Transglutaminase is an enzyme able to perform more than one enzymatic action, acting on a variety of different substrates. The growth of knowledge about the members of the transglutaminase enzyme family and its substrates over the last 50 years indicates a large interest and curiosity about this protein, whose function and structure were, and still are, an important object of research. On the other hand, the involvement in a number of human diseases, together with the lack of knowledge about the biological functions played by some of the most studied members of this family, makes this enzyme a fascinating field of study. The history of this enzyme and its substrates, whose cross-linking action was reported for the first time 50 years ago, suggests that an effort to increase knowledge and communication among researchers is required. To achieve this important result, 10 years ago an internet web page called worldwide happening around transglutaminase (WHAT) was created. Driven by these experiences, novel points of view from which to look at transglutaminase and its substrates may be identified.
Simulation of the correct side chain conformations of amino acid residues is an intriguing issue not only for computational biology, but also for practical outcomes in biotechnology and medicine. This is also a main challenge for molecular simulations, since even in the homology modelling strategy (which uses templates to predict the unknown structure of a protein), the conformation of a side chain generally cannot be easily deduced from the structure of the template. It is important also for applications such as molecular design and docking. Moreover, the correct simulation of the effects of amino acid mutations occurring in many genetic diseases may help in understanding the molecular mechanisms that underlie those pathologies. This review aims to summarize the different strategies developed to predict side chain conformations in proteins. In the current review, differences in approaches are discussed and problems are analyzed, as well as parameters and criteria to compare and evaluate performances of the programs.
Genomic DNA copy number aberrations are frequent in solid tumours, although the underlying causes of chromosomal instability in tumours remain obscure. In this paper we show how the Artificial Immune System (AIS) paradigm can be successfully employed in the elucidation of biological dynamics of cancerous processes using a novel fuzzy rule induction system for data mining (IFRAIS) of aCGH data. Competitive results have been obtained using IFRAIS. A biological interpretation of the results carried out using Gene Ontology is currently under investigation.
We identified genomic and network properties of approximately 600 genes mutated in different cancer types. These genes tend not to duplicate but, unlike most human singletons, they encode central hubs of highly interconnected modules within the protein-protein interaction network (PIN). We find that cancer genes are fragile components of the human gene repertoire, sensitive to dosage modification. Furthermore, other nodes of the human PIN with similar properties are rare and probably enriched in candidate cancer genes.
BACKGROUND: Web Services and Workflow Management Systems can support creation and deployment of network systems able to automate data analysis and retrieval processes in biomedical research. Web Services have been implemented at bioinformatics centres and workflow systems have been proposed for biological data analysis. New databanks are often developed by taking into account these technologies, but many existing databases do not allow programmatic access. Only a fraction of available databanks can thus be queried through programmatic interfaces. SRS is a well-known indexing and search engine for biomedical databanks offering public access to many databanks and analysis tools. Unfortunately, these data are not easily and efficiently accessible through Web Services. RESULTS: We have developed 'SRS by WS' (SWS), a tool that makes information available in SRS sites accessible through Web Services. Information on known sites is maintained in a database, srsdb. SWS consists of a suite of WS that can query both srsdb, for information on sites and databases, and SRS sites. SWS returns results in a text-only format and can be accessed through a WSDL compliant client. SWS enables interoperability between workflow systems and SRS implementations, by also managing access to alternative sites, in order to cope with network and maintenance problems, and selecting the most up-to-date among available systems. CONCLUSIONS: Development and implementation of Web Services that enable programmatic access to an exhaustive set of biomedical databases can significantly improve automation of in-silico analysis. SWS supports this activity by making biological databanks that are managed in public SRS sites available through a programmatic interface.
Data integration is needed in order to cope with the huge amounts of biological information now available and to perform data mining effectively. Current data integration systems have strict limitations, mainly due to the number of resources, their size and frequency of updates, their heterogeneity and distribution on the Internet. Integration must therefore be achieved by accessing network services through flexible and extensible data integration and analysis network tools. EXtensible Markup Language (XML), Web Services and Workflow Management Systems (WMS) can support the creation and deployment of such systems. Many XML languages and Web Services for bioinformatics have already been designed and implemented and some WMS have been proposed. In this article, we review a methodology for data integration in biomedical research that is based on these technologies. We also briefly describe some of the available WMS and discuss the current limitations of this methodology and the ways in which they can be overcome.
Identification of prognosticators for Binet A chronic lymphocytic leukemia is important for selecting patients with dismal prognosis. We analyzed CD49d expression in 140 consecutive Binet A chronic lymphocytic leukemia cases. At diagnosis, CD49d ≥30% (54/140, 38.6%) was associated with proliferation markers, namely CD38 ≥30% (p=3.9×10⁻⁶), LDH (p=0.007) and β2-microglobulin (p=0.020). Univariate log-rank analysis identified CD49d ≥30% as a risk factor for treatment-free survival (p=8.3×10⁻⁵), time to progression to a more advanced stage (p=4.7×10⁻⁴), and time to lymphocyte doubling (p=0.009). Multivariate analysis selected CD49d ≥30% as an independent treatment-free survival predictor after adjustment for biological (HR 2.28; 95% CI 1.71–4.45, p=0.015) and both biological and clinical variables analyzed together (HR 3.33, 95% CI 1.61–6.90, p=0.001). Within Binet A subgroups harboring favorable biological variables (IGHV homology <98%, favorable karyotype, CD38 <30%, ZAP70 <20%) or clinical variables, CD49d ≥30% consistently identified a subset of patients with short treatment-free survival. Our observations indicate CD49d ≥30% as a new marker for the initial prognostic assessment of Binet A chronic lymphocytic leukemia.
Comprehensive analysis of the gene expression profiles associated with human monocyte-to-macrophage differentiation and polarization toward M1 or M2 phenotypes led to the following main results: 1) M-CSF-driven monocyte-to-macrophage differentiation is associated with activation of cell cycle genes, substantiating the underestimated proliferation potential of monocytes. 2) M-CSF leads to expression of a substantial part of the M2 transcriptome, suggesting that under homeostatic conditions a default shift toward M2 occurs. 3) Modulation of genes involved in metabolic activities is a prominent feature of macrophage differentiation and polarization. 4) Lipid metabolism is a main category of modulated transcripts, with expected up-regulation of cyclo-oxygenase 2 in M1 cells and unexpected cyclo-oxygenase 1 up-regulation in M2 cells. 5) Each step is characterized by a different repertoire of G protein-coupled receptors, with five nucleotide receptors as novel M2-associated genes. 6) The chemokinome of polarized macrophages is profoundly diverse and new differentially expressed chemokines are reported. Thus, transcriptome profiling reveals novel molecules and signatures associated with human monocyte-to-macrophage differentiation and polarized activation which may represent candidate targets in pathophysiology.
Mantovani A, Sica A, et al. Macrophage polarization comes of age. Immunity. 2005. 23: 344 - 346
Functional polarization of macrophages into M1 or M2 cells is an operationally useful, simplified conceptual framework describing the plasticity of mononuclear phagocytes. Genetic approaches have begun to shed new light on mechanisms underlying macrophage polarization and on the actual in vivo significance of polarized M2 cells.
CD49d/α4-integrin is variably expressed in chronic lymphocytic leukemia (CLL). We evaluated its relevance as an independent prognosticator for overall survival and time to treatment (TTT) in a series of 303 (232 for TTT) CLLs, in comparison with other biologic or clinical prognosticators (CD38, ZAP-70, immunoglobulin variable heavy chain (IGHV) gene status, cytogenetic abnormalities, soluble CD23, β2-microglobulin, Rai staging). Flow cytometric detection of CD49d was stable and reproducible, and the chosen cut-off (30% CLL cells) easily discriminated CD49d-low from CD49d-high cases. CD49d, whose expression was strongly associated with that of CD38 (P < .001) and ZAP-70 (P < .001), or with IGHV mutations (P < .001), was an independent prognosticator for overall survival along with IGHV mutational status (CD49d hazard ratio, HR_CD49d = 3.52, P = .02; HR_IGHV = 6.53, P < .001) or, if this parameter was omitted, with ZAP-70 (HR_CD49d = 3.72, P = .002; HR_ZAP-70 = 3.32, P = .009). CD49d was also a prognosticator for TTT (HR = 1.74, P = .007) and refined the impact of all the other factors. Notably, a CD49d-high phenotype, although not changing the outcome of good prognosis (ZAP-70-low, mutated IGHV) CLL, was necessary to correctly prognosticate the shorter TTT of ZAP-70-high (HR = 3.12; P = .023) or unmutated IGHV (HR = 2.95; P = .002) cases. These findings support the introduction of CD49d detection in routine prognostic assessment of CLL patients, and suggest both pathogenetic and therapeutic implications for CD49d expression in CLL.
Rete Nazionale di Bioinformatica Oncologica (National Network of Oncological Bioinformatics). Website: http://www.rnbio.it/ - For information: info@rnbio.it - Announcements mailing list: news@rnbio.it

