Malignant lymphoma is a heterogeneous disease with different clinical features. Among diffuse large B-cell lymphomas (DLBCLs), a unique subtype has been identified recently based on the cell surface marker CD5 and clinicopathological features. These de novo CD5+ DLBCLs account for ∼10% of all DLBCLs and have a poorer prognosis. To further understand this subtype of DLBCLs at the molecular level and to find genes that are differentially expressed in de novo CD5+ DLBCLs, CD5− DLBCLs, and mantle cell lymphomas, which also have a poor prognosis, we performed gene expression profiling using cDNA microarray technology. Data from a total of 9 samples of CD5− DLBCLs, 11 samples of de novo CD5+ DLBCLs, and 10 samples of mantle cell lymphomas were acquired. A series of genes were identified that distinguish these three types of lymphomas. Among DLBCL cases, the adhesion molecules integrin β1 and/or CD36 were overexpressed in most cases of CD5+ DLBCL. An immunohistochemical confirmation study revealed that integrin β1 was expressed on lymphoma cells, which may account for the high extranodal involvement and poor prognosis of CD5+ DLBCLs. In contrast, CD36 was overexpressed on vascular endothelia in CD5+ DLBCLs, although there was no difference in vascularity detected by von Willebrand factor antibody between CD5+ and CD5− DLBCLs. These results suggest that CD5+ and CD5− DLBCLs have different gene expression signatures in both tumor cells and their vascular systems.
DLBCL⁴ is the most common subtype of B-cell lymphoma and is heterogeneous in clinical response to current therapy and in survival time. DLBCLs are also immunophenotypically heterogeneous; ∼10% of DLBCLs express CD5 antigen (1, 2, 3). CD5 antigen was originally considered a T-cell marker but is also found in a subset of B cells. Among B-cell malignancies, CD5 is mainly expressed in CLL and MCL (4). It has been shown that the tumor cells from patients with Richter’s syndrome resulting from aggressive transformation of CLL express CD5 antigen (5). However, most patients with CD5+ DLBCL have no history of lymphoproliferative disease, including CLL. Thus, this type of DLBCL is considered to arise “de novo” (5). We and others reported previously that de novo CD5+ DLBCLs have distinct phenotypic, genotypic, and clinical features (1, 2, 3, 5, 6). For example, the de novo CD5+ DLBCLs are immunohistochemically negative for both CD23 and cyclin D1, whereas CLLs are positive for CD23, and MCLs are positive for cyclin D1 (1, 2, 3, 4, 5). The de novo CD5+ DLBCLs express immunoglobulin heavy chain variable region VH genes with somatic mutations that are different from those of CLLs and MCLs, indicating different cell origins for CD5+ DLBCLs and CLLs/MCLs (1, 2, 3). In addition, the prognosis of patients with de novo CD5+ DLBCL is poorer than that of patients with CD5− DLBCL (6, 7). These findings demonstrate that de novo CD5+ DLBCLs constitute a distinct subgroup of DLBCLs with poor prognosis.
A recent study showed that gene expression profiling of DLBCLs identified two distinct types: germinal center B-like DLBCL and activated B-like DLBCL (8). In the present study, cDNA microarray analysis was performed on 30 RNA samples of CD5− and CD5+ DLBCLs and MCLs (cyclin D1+) to identify gene expression activities that are associated with these three groups. Using a robust gene identification approach based on σ-classifier design (9), we identified genes that separate CD5+ from CD5− DLBCLs, as well as genes that separate MCLs from DLBCLs. Among the top genes that separate CD5+ and CD5− DLBCLs are integrin β1 and CD36, which are expressed in tumor cells and vascular endothelia of CD5+ DLBCL cases, respectively.
MATERIALS AND METHODS
Clinical samples were obtained from 10 patients with CD5− DLBCL, 11 patients with de novo CD5+ DLBCL, and 10 patients with MCL (Table 1). The diagnoses were made according to the WHO Classification of Tumors of Hematopoietic and Lymphoid Tissues (10). One patient with CD5− DLBCL was excluded because this patient was diagnosed with mediastinal large B-cell lymphoma, which was already established as a distinct subtype of DLBCL (10), leaving 9 CD5− DLBCL samples for analysis.
DNA microarray studies using specimens of patients with hematopoietic malignancies were approved by the Institutional Review Committee of Mie University School of Medicine.
A total of 2142 known human cDNAs were prepared by PCR from the Research Genetics cDNA clone library, purified using MultiScreen PCR plates (Millipore Corp., Bedford, MA), and verified by sequencing at the Cancer Genomics Core Lab (M. D. Anderson Cancer Center) before printing (11). The DNA clones, in 384-well plates, were spotted in replicate onto poly-L-lysine-coated microscope slides using an arrayer (Genomic Solutions, Ann Arbor, MI).
RNA Amplification and Target Labeling.
Lymphoma tissues were ground to powder under frozen conditions and lysed in TRI reagent lysis buffer (MRC, Cincinnati, OH). Cell pellets of six hematopoietic cell lines (K562, HL60, NB4, BV173, KBM7, and Jurkat) were also lysed in TRI reagent. Control RNA was prepared by mixing equal amounts of total RNA extracted from these six cell lines. The labeling reaction was performed as described previously (12, 13).
Microarray Hybridization and Image Scanning.
To hybridize the slides, purified and labeled cDNA targets were dissolved in a total volume of 130 μl of ExpressHyb solution (Clontech, Palo Alto, CA) containing 8 μg of poly(dA)40–60 (Amersham Pharmacia), 2 μg of yeast tRNA (Life Technologies, Inc.), and 10 μg of human Cot I DNA (Life Technologies, Inc.). The mixture was heated to 95°C for 10 min, then applied to the slides and covered with a coverslip. Hybridization was carried out at 60°C for 14–16 h in a humidified box. Slides were washed sequentially at 37°C for 2 min each in 1× SSC (20× SSC: 3 M sodium chloride and 0.3 M sodium citrate) with 0.01% SDS, in 0.2× SSC with 0.01% SDS, and twice in 0.1× SSC. Hybridized arrays were scanned at 10 μm resolution on a GeneTAC LS-IV scanner (Genomic Solutions), and the obtained signal intensities were quantified with ArrayVision (Imaging Research Inc., St. Catharines, Ontario, Canada).
Assessment of Replicability of the Data.
As mentioned above, each gene is spotted in duplicate on the array. Thus, the variation between the replicate spots can be used to assess the reliability of the expression measurement for that gene. Informally, if the two replicate measurements are close to each other, then the expression estimate for that gene can be obtained by a combination (e.g., average) of the two replicates. If, on the other hand, the two replicates are quite different, then that gene should be flagged as unreliable, and the measurements should not be used in subsequent analysis. We use the following simple method to flag unreliable measurements. For each array (patient), we compute the SD of the absolute values of the differences between the corresponding replicates. Then, any absolute difference that exceeds three times this SD is flagged as unreliable. Using this procedure, ∼3% of the genes, on average, are deemed unreliable.
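The flagging rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `flag_unreliable`, the use of the population SD, and the choice of the replicate mean as the combined value are ours, and the source does not say whether the differences were taken on raw or log-transformed intensities.

```python
from statistics import pstdev

def flag_unreliable(rep_a, rep_b, k=3.0):
    """Flag genes whose two replicate spots disagree strongly on one array.

    rep_a, rep_b: equal-length lists with the two replicate measurements
                  of every gene on a single array.
    k:            multiple of the SD used as the cutoff (3 in the text).

    Returns (flags, combined). flags[i] is True when the absolute replicate
    difference of gene i exceeds k times the SD of all absolute differences
    on the array. combined[i] is the replicate mean, or None when flagged.
    """
    if len(rep_a) != len(rep_b):
        raise ValueError("replicate lists must have the same length")
    abs_diff = [abs(a - b) for a, b in zip(rep_a, rep_b)]
    # Assumption: population SD; the text does not specify which estimator.
    sd = pstdev(abs_diff)
    flags = [d > k * sd for d in abs_diff]
    combined = [None if f else (a + b) / 2.0
                for a, b, f in zip(rep_a, rep_b, flags)]
    return flags, combined

# Toy array: 19 genes with consistent replicates and one discordant gene.
rep_a = [1.0] * 20
rep_b = [1.1] * 19 + [3.0]
flags, combined = flag_unreliable(rep_a, rep_b)
```

In this toy array only the last gene is flagged, and its combined value is set to `None` so that it drops out of any downstream analysis.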
Algorithm for Finding Strong Feature (Gene) Sets.
We desire classifiers that categorize sample tissues based on the expression values of a set of genes. Because the number of samples in clinical studies is often small, we use a simple classifier built from only a few genes (at most three in this study). This reduces the likelihood of obtaining a classifier that does well on the sample data but performs poorly on the population. To design the classifier, we used a recently developed σ-classifier algorithm. A σ-classifier is designed from a probability distribution obtained by spreading the mass of the sample points via a circular distribution of variance σ², which makes classification more difficult while maintaining the sample geometry (9).
A classifier that has a small error for a large variance is desirable, because its performance is more likely to be robust relative to new data. If a classifier has an extremely small error (≈0) for a small σ but a large error for a large σ, then we do not consider it sufficiently strong. Intuitively, this method mitigates overfitting of the data, because it favors solutions for which small changes in the data lead to small changes in the classifier.
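The idea of judging a classifier by its error under a spread sample distribution can be illustrated with a small Monte Carlo sketch. This is not the analytic design procedure of reference 9: here the "circular distribution" is assumed to be an isotropic Gaussian of SD σ, the classifier is a simple nearest-centroid rule chosen for brevity, and all names (`spread_error`, `draws`, the toy data) are hypothetical.

```python
import random

def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(len(points[0]))]

def nearest_centroid_label(x, c0, c1):
    """Return 0 or 1 according to which centroid is closer to x."""
    d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
    d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
    return 0 if d0 <= d1 else 1

def spread_error(class0, class1, sigma, draws=500, seed=0):
    """Estimate the error of a nearest-centroid rule when every sample
    point is replaced by `draws` noisy copies (isotropic Gaussian, SD sigma)."""
    rng = random.Random(seed)
    c0, c1 = centroid(class0), centroid(class1)
    errors = total = 0
    for label, points in ((0, class0), (1, class1)):
        for p in points:
            for _ in range(draws):
                noisy = [v + rng.gauss(0.0, sigma) for v in p]
                errors += nearest_centroid_label(noisy, c0, c1) != label
                total += 1
    return errors / total

# Two hypothetical two-gene feature sets. Both separate the classes
# perfectly on the raw sample, but only the first keeps a wide margin.
wide0 = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0]]
wide1 = [[3.0, 3.1], [3.2, 2.9], [2.9, 3.0]]
narrow0 = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0]]
narrow1 = [[0.5, 0.6], [0.7, 0.4], [0.4, 0.5]]

for sigma in (0.0, 0.25, 0.5):
    print(sigma,
          spread_error(wide0, wide1, sigma),
          spread_error(narrow0, narrow1, sigma))
```

Both toy feature sets have zero error at σ = 0, but the error of the narrowly separated set grows quickly as σ increases, which is the behavior the text describes as not sufficiently strong.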
The second feature of the algorithm is that it searches for gene combinations that separate the classes. This feature is especially attractive in biological settings where heterogeneity is the norm and no single gene can dictate classifications.
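A search over gene combinations of this kind can be sketched as an exhaustive enumeration of all gene sets up to a given size. The scoring function below is an assumption made to keep the example short: it uses the resubstitution error of a nearest-centroid rule as a stand-in for the σ-error of reference 9, and the names `centroid_error` and `rank_gene_sets` are hypothetical.

```python
from itertools import combinations

def centroid_error(samples, labels, genes):
    """Fraction of samples misassigned by a nearest-centroid rule that
    uses only the gene indices in `genes`. Labels must be 0 or 1."""
    groups = {0: [], 1: []}
    for row, y in zip(samples, labels):
        groups[y].append([row[g] for g in genes])
    cents = {y: [sum(col) / len(pts) for col in zip(*pts)]
             for y, pts in groups.items()}
    wrong = 0
    for row, y in zip(samples, labels):
        x = [row[g] for g in genes]
        dist = {c: sum((a - b) ** 2 for a, b in zip(x, cents[c]))
                for c in cents}
        wrong += min(dist, key=dist.get) != y
    return wrong / len(samples)

def rank_gene_sets(samples, labels, max_size=3, top=5):
    """Score every gene set of size 1..max_size and return the `top`
    sets as (error, size, gene_indices), smallest error first and
    smaller sets preferred on ties."""
    n_genes = len(samples[0])
    scored = []
    for size in range(1, max_size + 1):
        for genes in combinations(range(n_genes), size):
            scored.append((centroid_error(samples, labels, genes), size, genes))
    scored.sort()
    return scored[:top]

# Toy data: genes 0 and 1 overlap between classes when taken alone but
# separate the classes jointly; gene 2 is uninformative.
samples = [[0, 1, 5], [1, 2, 6], [2, 3, 5], [3, 4, 6],
           [1, 0, 5], [2, 1, 6], [3, 2, 5], [4, 3, 6]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
best = rank_gene_sets(samples, labels)
```

In the toy data neither gene 0 nor gene 1 alone separates the classes, while the pair does, so the top-ranked set is the pair `(0, 1)`. The cost of enumeration grows quickly: for 2142 genes there are roughly 1.6 × 10⁹ three-gene sets, which is consistent with the use of a parallel computer for the search.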
All of the computations that search and analyze classifiers in this study were done on a Beowulf-based supercomputer at the Center for Information Technology at NIH. This system is a distributed memory parallel computer consisting of a total of 780 XP/Athlon and Pentium III processors interconnected through a high-speed network.
Histopathology and Immunophenotypic Study.
Lymphoma tissues were fixed in 10% formalin and embedded in paraffin, followed by the staining of sections (5-μm thick) with H&E and Giemsa. The immunophenotypic study of tumor cells used a labeled avidin-biotin method for frozen sections as described previously (14) . The monoclonal antibodies used in the present study were Leu4(CD3), Leu1(CD5), Leu12 (CD19), and CR2 (CD21; Becton Dickinson, Mountain View, CA); CALLA(CD10), L26 (CD20), MHM6(CD23), anti-IgG, anti-IgA, anti-IgM, anti-IgD, anti-κ, anti-λ, and F8/86 (vWF; DAKO, Carpinteria, CA); 4B7R (integrin β1) and SMO (CD36; Santa Cruz Biotechnology, Inc., Santa Cruz, CA). In all of the cases, the tumor cells expressed CD19 and/or CD20, but not CD3. Cyclin D1 (IBL, Gunma, Japan) expression was examined by immunohistochemistry using paraffin sections. More than 20% positivity of the tumor cells was assumed to indicate positivity for the purposes of this study.
Identification of Genes Differentially Expressed in CD5+ and CD5− DLBCLs and MCLs.
Total RNAs were isolated from tumor samples of 11 CD5+ DLBCLs, 9 CD5− DLBCLs, and 10 MCLs, and subjected to microarray analysis. After obtaining the quantified gene expression data for the 30 lymphoma samples, we set out to identify strong feature genes that robustly distinguish the three subtypes of lymphomas using the σ-classifier algorithm (9). The advantage of this algorithm is that it identifies not only genes that separate the groups by themselves but also combinations of genes with improved discriminating power. The algorithm also generates a conservative classifier gene list that avoids overfitting of the data, thus producing less error in future classification (9).
Using this algorithm, we identified a number of gene combinations that robustly separate CD5− DLBCL from de novo CD5+ DLBCL, as well as MCL from DLBCL. We first examined the genes that were selected repeatedly in the combinations (Tables 2 and 3). To evaluate globally how well those strong feature genes separate the three groups of lymphomas when they were merged together, we carried out a hierarchical clustering analysis using 25 genes from the MCL versus DLBCL classifications and 42 genes from the CD5+ versus CD5− classifications (Fig. 1, A and B). MCLs were clustered correctly (Fig. 1A), and de novo CD5+ and CD5− DLBCL cases were completely separated (Fig. 1B), as expected.
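A hierarchical clustering of samples on a selected gene list can be sketched as follows. The source does not state the distance metric or linkage rule that was used, so the choices here (1 − Pearson correlation, average linkage) are assumptions, as are the function names and the toy expression values.

```python
from math import sqrt

def pearson_distance(u, v):
    """1 - Pearson correlation of two expression profiles.
    Raises ZeroDivisionError if either profile is constant."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return 1.0 - cov / (su * sv)

def average_linkage(profiles, n_clusters):
    """Agglomerative clustering of samples (rows of `profiles`).
    Repeatedly merges the two clusters with the smallest average pairwise
    distance until `n_clusters` remain. Returns lists of sample indices."""
    n = len(profiles)
    dist = {(i, j): pearson_distance(profiles[i], profiles[j])
            for i in range(n) for j in range(i + 1, n)}

    def linkage(a, b):
        pairs = [dist[(min(i, j), max(i, j))] for i in a for j in b]
        return sum(pairs) / len(pairs)

    clusters = [[i] for i in range(n)]
    while len(clusters) > n_clusters:
        _, ia, ib = min((linkage(clusters[a], clusters[b]), a, b)
                        for a in range(len(clusters))
                        for b in range(a + 1, len(clusters)))
        clusters[ia] = clusters[ia] + clusters[ib]
        del clusters[ib]
    return [sorted(c) for c in clusters]

# Toy profiles (rows: samples, columns: four selected genes). Samples 0-2
# are high in the first two genes, samples 3-4 in the last two.
profiles = [
    [2.0, 1.8, 0.2, 0.1],
    [1.9, 2.1, 0.1, 0.3],
    [2.2, 1.7, 0.3, 0.2],
    [0.1, 0.3, 2.0, 1.9],
    [0.2, 0.1, 1.8, 2.2],
]
groups = average_linkage(profiles, n_clusters=2)
```

With the toy profiles, cutting the tree at two clusters recovers the two groups of samples. Applied to real data, the rows would be the lymphoma samples and the columns the strong feature genes.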
The CD5 gene was spotted on the microarray used in the present study, but in most cases, the expression level of CD5 detected by the microarray was weak. Therefore, the detectable expression levels of CD5 did not reflect the immunohistochemical classification into CD5-positive and CD5-negative DLBCL. On the other hand, consistent with the immunohistological staining, cyclin D1 levels were high in most cases of MCL, and cyclin D1 is the top classifier for MCL versus DLBCL (Table 2).
Confirmation of Integrin β1 and CD36 Overexpression in CD5+ DLBCLs by Immunohistochemistry.
Among the genes that separate de novo CD5+ and CD5− DLBCL, six genes (integrin β1, CD36, mRNA for translocation protein-1, monocyte chemotactic protein 3 precursor, UDP-glucose pyrophosphorylase 2, and inhibitor of DNA binding 2) were overexpressed in de novo CD5+ DLBCL (Fig. 1B). We focused on two genes, integrin β1 and CD36, for confirmation studies by immunohistochemistry, because both genes were overexpressed in most cases of CD5+ DLBCL. Furthermore, overexpression of the adhesion molecules integrin β1 and CD36 may account for the aggressive features of CD5+ DLBCL. Immunohistochemistry was selected for confirmation because its results extend the microarray data to the protein level and to the level of cellular localization. The second benefit is important because microarrays measure a population-average effect, and tumor tissues are a mixture of different cell types.
Immunohistochemical staining of integrin β1 showed that integrin β1 was expressed on lymphoma cells of CD5+ DLBCLs (Fig. 2B). However, immunohistochemical staining of CD36 revealed that CD36 was overexpressed on vascular endothelia of CD5+ DLBCLs and not expressed on lymphoma cells (Fig. 2F). In all cases of CD5− nodal DLBCLs tested immunohistochemically, CD36 was expressed neither on vascular endothelia nor on lymphoma cells (Table 4; Fig. 2D). To evaluate the vascularity of lymphoma tissues, vWF staining was performed simultaneously using adjacent sections from the same lymphoma tissues used in CD36 staining (Fig. 2, C, E, and G). There were no significant differences in vascularity among CD5+ DLBCLs, CD5− DLBCLs, and MCLs (Fig. 2, C, E, and G), suggesting that CD36 overexpression on vascular endothelial cells of CD5+ DLBCLs reflects a different molecular signature in the vessels of CD5+ DLBCLs. Among 8 cases of CD5− extranodal DLBCLs, CD36 was expressed on vascular endothelia in 5 cases, but the expression levels of CD36 in CD5− extranodal DLBCLs were weaker than those in CD5+ DLBCL cases (Table 2). In 5 patients with MCLs, including 3 new patients not analyzed by microarray, integrin β1 was weakly expressed in 1 patient, and CD36 was expressed on a few vessels in 3 patients (Fig. 2H).
Gene Combinations as Stronger Classifiers.
Cancers are highly heterogeneous, and multiple molecular changes occur during cancer progression. Even the best markers do not cover all cancers of the same type. In the present study, integrin β1 and CD36 are good markers in the sense that they are overexpressed in most CD5+ DLBCLs. However, they are not necessarily overexpressed together. As mentioned previously, classifiers constructed from combinations of genes provide more accurate classification than single-gene classifiers. On the basis of the mathematical model used in this study, the two-gene set of integrin β1 and CD36 provides a more robust classifier for CD5+ versus CD5− DLBCLs (Fig. 2I). In other words, the combination of these two molecular events is more reflective of the molecular activities in the CD5+ group of DLBCLs.
In the present study, we used a 4800-feature cDNA microarray including >2000 known cellular genes to analyze 30 different lymphoma samples. One-third of the cases represent a well-recognized subtype of lymphoma, MCL. The present study selected strong feature genes and correctly distinguished MCLs from DLBCLs. The fact that cyclin D1 is selected as a top strong feature gene validates the reliability of the microarray experiments in the present study and supports previous reports in the literature (14, 15). The other two-thirds are DLBCLs consisting of two recently identified subgroups defined by the surface marker CD5. The de novo CD5+ DLBCL has been reported to be a clinicopathologically unique subgroup of DLBCL, which we expected to have unique gene expression events reflecting its pathophysiological status. Indeed, the gene expression events associated with CD5+ DLBCLs include the overexpression of some interesting genes. Integrin β1, an adhesion molecule, plays an important role in B-cell lymphoma adhesion and chemotaxis on fibronectin (16); disruption of the integrin β1 gene in a lymphoma cell line reduced its metastatic potential (17). It has been reported that protein expression of β-integrin adhesion molecules in non-Hodgkin’s lymphoma correlated with extranodal involvement (18) and that negative or low expression of β-integrin is associated with favorable prognosis (18).
A particular case is the CD36 antigen, which is a thrombospondin receptor. CD36 is transcriptionally regulated by Oct-2, which is a regulator of B-cell differentiation (19). CD36 expression in chronic B-cell lymphoproliferative disorders is related to tumor metastasis (20), and its expression in B-CLLs is reported to be an indicator of tumor cell dissemination (21). On the basis of these reports, we initially anticipated that CD36 would be expressed in lymphoma cells. Our immunohistochemical staining revealed otherwise: CD36 in the CD5+ DLBCL tissues is expressed in the vascular endothelial cells rather than the lymphoma cells. This finding offers several insights. First, endothelial cells in different types of cancers have different molecular events. Thus, the “normal” endothelial cells in cancer are part of the pathophysiological system of the disease. This finding supports the use of “unpurified” tumor tissues for genomic and molecular studies rather than tumor cells “purified” by microdissection. The view that vascular endothelium is an integral part of cancer is also supported by a recent report that endothelial cells of tumor and normal tissues have different gene expression profiles (22). The second implication is that tumor cells have close communication with endothelial cells residing in the tumor. It has been reported that intravascular or intrasinusoidal infiltration was observed in 19% of de novo CD5+ DLBCLs (7). It is conceivable that CD36-expressing endothelial cells contribute to intravascular lymphoma, which is occasionally seen in CD5+ DLBCLs.
This hypothesis is supported by our analysis, which showed that the combination of integrin β1 and CD36 associates better with CD5+ DLBCL than either gene alone. It is conceivable that the CD5+ lymphoma cells with high integrin β1 expression communicate with the endothelial cells, which express CD36. This notion is further supported by a recent report that integrin β1 physically interacts with CD36 (23). Thus, CD36 may serve as a target for intervention to improve the prognosis of CD5+ DLBCLs. It would also be of special interest to investigate the mechanism through which CD36 is activated in the endothelial cells of CD5+ DLBCLs.
We thank Dr. Richard Ford for his valuable discussion and critical reading of this manuscript.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
↵1 Supported in part by grants (No.13671059, No.12217064, and No.12217062) from Japanese Ministry of Education, Culture, Sports, Science and Technology; Tobacco Settlement Fund to M. D. Anderson Cancer Center as appropriated by the Texas Legislature; and a grant from Kadoorie Foundation to M. D. Anderson Cancer Center.
↵2 To whom requests for reprints should be addressed, at The Second Department of Internal Medicine, Mie University School of Medicine, 2-174, Edobashi, Tsu, 514-8507, Japan. Phone: 81-59-231-5016; Fax: 81-59-231-5200; E-mail: (T. K.) or Cancer Genomics Core Laboratory, Department of Pathology, Box 85, The University of Texas M. D. Anderson Cancer Center, Houston, Texas 77030. Phone: (713) 745-1103; Fax: (713) 792-5549; E-mail: (W. Z.).
↵3 These authors contributed equally to this work.
↵4 The abbreviations used are: DLBCL, diffuse large B-cell lymphoma; CD, cluster of differentiation; CLL, chronic lymphocytic leukemia; MCL, mantle cell lymphoma; EST, expressed sequence tag; vWF, von Willebrand factor.
- Received August 5, 2002.
- Accepted October 30, 2002.
- ©2003 American Association for Cancer Research.