The short-chain fatty acid butyrate, produced by microbial fermentation of dietary fiber in the large intestine, is a physiological regulator of major pathways of colonic epithelial cell maturation: cell cycle arrest, lineage-specific differentiation, and apoptosis. Microarray analysis of 8,063 sequences demonstrated a complex cascade of reprogramming of SW620 colonic epithelial cells upon treatment with butyrate characterized by the progressive recruitment of gene sets as a function of time. Comparison with the effects of trichostatin A, in conjunction with differences in the kinetics of alteration of histone acetylation induced by butyrate and trichostatin A, identified subsets of induced and repressed genes likely coordinately regulated by altered histone acetylation. The butyrate response was also compared in detail with that of sulindac, a nonsteroidal anti-inflammatory drug with significant chemopreventive activity for colon cancer, and curcumin, a component of mustard and curry structurally and functionally related to sulindac that also has chemopreventive activity. Although gene clusters were identified that showed similar responses to butyrate and sulindac, the data were characterized by the extensive differences in the effects of the two agents. This was striking for functional classes of genes involved in signaling pathways and in cell cycle progression, although butyrate and sulindac induce a similar G0-G1 arrest, elevation of β-catenin-Tcf signaling, and apoptotic cascade. As regards cell cycle arrest, the underlying mechanism in response to butyrate was most similar to that of the Caco-2 cell line that had spontaneously undergone a G0-G1 arrest and least similar to the G2-M arrest stimulated by curcumin. Thus, high-throughput microarray analysis of gene expression profiles can be used to characterize and distinguish the mechanisms of response of colonic epithelial cells to physiological and pharmacological inducers of cell maturation. 
This has important implications for characterization of chemopreventive agents and recognition of potential toxicity and synergies. The data bases, gene clusters, and analyses are available at http://sequence.aecom.yu.edu/genome/.
Short-chain fatty acids (SCFAs) are produced in the colonic lumen by microbial fermentation of dietary fiber and serve as the principal energy source for colonic epithelial cells, where they are metabolized by mitochondrial β-oxidation (1). They play a role in homeostasis of the colonic mucosa because they induce pathways of cell maturation, including cell cycle arrest, differentiation, and apoptosis (2, 3, 4). Although SCFAs will elicit similar effects in a number of cell types in vitro, this is of particular importance for colonic cells because the concentrations of SCFAs, especially butyrate, that can trigger these pathways are readily achieved in vivo in the colonic lumen (5, 6).
In fact, there is significant evidence that SCFAs play a physiological role as a key inducer of cell maturation pathways in the colon. In patients who suffer from colitis attributable to diversion of the fecal stream, introduction of SCFAs by enema promotes maturation of the mucosal cells (7) , and in rats, supplementation with SCFAs alleviates the intestinal atrophy that results from total parenteral nutrition (8) .
We and others have studied the cell biology of the response to the SCFA butyrate. There is a temporally ordered response, consisting of an early phase of commitment during which cells arrest in G0-G1 at 12–24 h and a late phase in which an apoptotic cascade is executed (2, 3, 4, 9, 10, 11). Cells also undergo a differentiation program in response to butyrate, with, for example, elevations in alkaline phosphatase mRNA and activity manifest as markers of differentiation along the absorptive cell lineage (2, 12). Our data from cell lines (2, 9) and from a mouse model with a homozygous deletion of the gene for short-chain acyl dehydrogenase (10) suggest that triggering and/or completing maturation pathways requires the mitochondrial β-oxidation of SCFAs.
Thus, significant reprogramming of colonic epithelial cells takes place in response to the SCFA butyrate, but the underlying mechanisms are unclear. Butyrate is an inhibitor of histone deacetylase (HDAC) activity (13, 14, 15), butyrate-responsive elements have been identified in the promoters of a number of genes (16, 17, 18, 19), and butyrate up-regulates β-catenin-Tcf signaling (9). Therefore, altered accessibility of cis-acting elements to trans-acting factors, brought about by changes in the extent of histone acetylation, may contribute to modulation of transcription by butyrate. However, the three maturation pathways (cell cycle arrest, lineage-specific differentiation, and apoptosis) of colonic epithelial cells stimulated by butyrate make it likely that there is a complex coordinated response to this physiological regulator. Array and imaging technology have permitted us to investigate this question.
In the first large-scale applications of array and quantitative imaging methods, we identified patterns of gene expression that characterized the mucosa of patients at genetic risk for development of colon cancer, distinguished benign and malignant tumors from each other and from the normal mucosa, and profiled gene expression in the butyrate-induced maturation of colonic epithelial cells (20, 21). These studies led to the discovery of a role of mitochondrial function in risk for, and progression of, colon tumors and eventually to our appreciation for a role of mitochondria in an apoptotic cascade (2, 3, 4, 22, 23). Although this early work quantified the level of expression of each of >4000 sequences, only relatively abundant sequences could be investigated, and until selected and sequenced, the cDNA clones analyzed were anonymous. However, the advent of modern genomic approaches, coupled with microarray technology, permits a more detailed approach to these questions (24, 25, 26, 27, 28, 29).
Here we identify a cascade of reprogramming of colonic epithelial cells that expands as a function of time of exposure to butyrate. This response has been compared with that induced by trichostatin A (TSA), which, like butyrate, is an inhibitor of HDAC (30); in conjunction with data demonstrating the different kinetics of change in histone acetylation stimulated by the two agents, this comparison has identified a population of sequences that are likely coordinately regulated by changes in histone acetylation. Comparative analysis has also been done with the response to sulindac, a nonsteroidal anti-inflammatory agent that has significant chemopreventive activity (31, 32, 33, 34, 35), and curcumin, a dietary chemopreventive agent from mustard and curry that is structurally related to sulindac (36, 37, 38, 39, 40, 41). Sulindac (9, 42, 43, 44) and TSA (9, 45, 46) induce a G0-G1 cell cycle arrest and an apoptotic response similar to that induced by butyrate, but curcumin induces a G2-M arrest without significant apoptosis (9, 47). Similarities in response demonstrate that butyrate and TSA are related in terms of mechanism of action, as are sulindac and curcumin. However, the data are more striking in terms of the marked differences in response among the agents and demonstrate, for example, that signaling pathways recruited into the butyrate and sulindac responses are different, and that the underlying mechanism of cell cycle arrest differs markedly between butyrate and curcumin, consistent with their induction of a G0-G1 and G2-M cell cycle arrest, respectively, but also between butyrate and sulindac, although both induce a G0-G1 arrest.
MATERIALS AND METHODS
SW620 colon carcinoma cells were maintained by serial passage in MEM containing 10% FCS as described previously (3, 4). For stimulation, cells were treated by addition to the culture medium of either 5 mM butyrate, 1.6 mM sulindac, 1.0 μM TSA, or 25 μM curcumin, all from Sigma Chemical Co. (St. Louis, MO), as described previously (9). Control cultures were treated simultaneously with carrier, and for each time point, a control flask was maintained for the same length of time and harvested in parallel. Caco-2 cells were maintained and underwent spontaneous cell cycle arrest and differentiation over a 21-day period, as described previously (48).
RNA and Probe Preparation.
SW620 cells were grown in T150 flasks, and RNA was prepared using the RNeasy Midi kit (Qiagen, Valencia, CA). For each microarray hybridization, two separate probes were made: one labeled with Cy3 (control) and one with Cy5 (treated). Aliquots of RNA (100 μg) were precipitated by addition of 2.5 volumes of cold ethanol and 0.1 volume of 3 M sodium acetate (pH 5.2), harvested by centrifugation, briefly lyophilized, and resuspended in 17 μl of DEPC (diethylpyrocarbonate)-treated dH2O. To this was added 2 μl of oligo dT primer (500 ng/μl; Life Technologies, Inc., Rockville, MD), and the mixture was heated at 65°C for 5 min to denature the nucleic acid. The following were then added: 8 μl of 5× first-strand buffer (250 mM Tris, 375 mM KCl, and 15 mM MgCl2, pH 8.3), 4 μl of 0.1 M DTT, 4 μl of deoxynucleotide triphosphate (25 mM each dATP, dCTP, and dGTP, 10 mM dTTP; Amersham, Piscataway, NJ), 4 μl of Cy3 or Cy5 dUTP (1 mM; Amersham), 1 μl of RNase inhibitor (30 units/μl; 5 Prime→3 Prime, Boulder, CO), and 2 μl of Superscript II Reverse Transcriptase (200 units/μl; Life Technologies, Inc.). The reaction was carried out for 1 h at 42°C. Probes were then denatured for 2 min at 95°C and diluted with 43 μl of distilled H2O; 10 μl of 10× reaction buffer were added (100 mM Tris, 50 mM EDTA, and 2 M sodium acetate, pH 7.5), and the RNA was hydrolyzed by the addition of 2 μl of RNase ONE (10 units/μl; Promega Corp., Madison, WI) and incubation for 10 min at 37°C.
The probe reactions were then clarified by a 10-min spin at high speed in a microfuge. The supernatants from the two reactions (control and treated, Cy3 and Cy5, respectively) were combined, diluted with 200 μl of distilled H2O, and applied to a Millipore YM-50 column (Millipore, Bedford, MA) to remove unincorporated nucleotides by three 400-μl washes with distilled H2O. The probes were recovered by inverting and spinning the column.
Hybridization and Washing of Microarrays.
Microarray slides produced by the Albert Einstein College of Medicine facility (49) were prepared for hybridization by moistening over boiling water and then immediately cross-linked with 60 mJ of UV light (UV Stratalinker; Stratagene, La Jolla, CA). Slides were again moistened over boiling water, snap-dried on a hot plate, and soaked for 1 h in 0.6 M succinic anhydride, 0.02 M sodium borate (pH 8.0) in 1-methyl-2-pyrrolidinone. After 15–20-s rinses in 0.1% SDS and distilled H2O, they were placed in boiling water for 5 min, followed by an ethanol rinse at 0°C. Slides were then prehybridized under a coverslip for 2 h at 42°C with 50% formamide, 6× SSPE (0.9 M NaCl, 0.012 M NaH2PO4, and 6 mM EDTA, pH 7.4), 2.5× Denhardt’s reagent (0.05% Ficoll, 0.05% polyvinylpyrrolidone, and 0.05% BSA), 0.5% SDS, and 100 μg/μl sheared salmon sperm DNA (Sigma). After prehybridization, the coverslip and excess buffer were removed. The probe volume was adjusted to 6.5 μl, to which was added 7 μl of deionized formamide, 4 μl of 20× SSPE, 1 μl of 50× Denhardt’s reagent, 0.5 μl of SDS, and 1 μl of hybridization blockers (10 μg/μl human Cot-1 DNA, 4 μg/μl yeast tRNA, and 2 μg/μl poly(A); from Life Technologies, Sigma, and Amersham, respectively), and the mixture was applied to the slide, covered with a coverslip, and hybridized at 42°C for 16–18 h. The slide was then washed in 1× SSC (0.015 M sodium citrate, 0.15 M NaCl, pH 7.0), 0.1% SDS long enough to remove the coverslip, and then washed for 5 min at room temperature in 0.2× SSC/0.1% SDS and twice for 5 min each in 0.2× SSC.
Microarrays and Scanning.
The Albert Einstein Microarray printer and laser scanner are custom designed and built instruments that have been described (49). The robotic printer is modeled after the design described by Brown and colleagues at Stanford University (24, 26, 27). Arrays used in this report consisted of the PCR-amplified products of 8,063 cloned sequences selected from a library of approximately 18,000 sequences received from Genome Systems.
After hybridization and washing, the emissions from the Cy3 and Cy5 fluorochromes were recorded in two separate high-resolution scans. Excitation for Cy3 and Cy5 was at 532 and 633 nm, respectively, and detection centered on 570–580 nm and 670–680 nm, respectively. The two images were superimposed, and the emission at each wavelength was quantified using Scanalyze 1.41 software (50). These data were transferred to an Excel spreadsheet, where the signal:background ratio for each channel was calculated as well as the ratio between these ratios (i.e., the green signal:background ratio was divided by the red signal:background ratio). The data were then normalized among arrays by expressing this value as a ratio to the average of these values for all 8,063 genes. The data were transformed to a log(2) scale and transferred to Microsoft Access, where a combination of Access, Microsoft Excel, GeneCluster (50), and TreeView (50) was used for analysis.
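The normalization steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the spreadsheet used in the study: the function name and the assumption of one background value per spot and channel are hypothetical, and all intensities are assumed to be positive.

```python
import math

def log2_normalized_ratios(green_signal, green_background,
                           red_signal, red_background):
    """Return per-gene log2 ratios normalized to the array-wide mean.

    Assumption (not from the paper): each list holds one positive value
    per spot, in the same gene order for all four lists.
    """
    ratios = []
    for gs, gb, rs, rb in zip(green_signal, green_background,
                              red_signal, red_background):
        green_sb = gs / gb                  # green signal:background ratio
        red_sb = rs / rb                    # red signal:background ratio
        ratios.append(green_sb / red_sb)    # ratio between the two ratios
    mean_ratio = sum(ratios) / len(ratios)  # average over all genes on the array
    # Express each value relative to the array mean, then move to a log2 scale.
    return [math.log2(r / mean_ratio) for r in ratios]
```

Dividing by the array-wide mean before taking the logarithm places a gene whose ratio equals the mean at zero, so arrays with different overall labeling efficiency can be compared on a common scale.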
For initial analysis, we considered all time points for a given agent as a single experiment. For each agent, data for a gene (i.e., six time points) were included in the data set if the signal:background ratio was >1.25 for either the red or green channel for at least three of the six measurements. When considering comparisons among all agents (i.e., calculation of N-dimensional Euclidean distances), the gene had to be included in the data sets for all agents.
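One reading of this inclusion rule is sketched below. The function names and data layout are hypothetical, and the sketch assumes that a measurement counts toward the threshold when either channel exceeds 1.25 at that time point; the text does not spell out this detail.

```python
THRESHOLD = 1.25   # signal:background cutoff stated in the text
MIN_PASSING = 3    # at least three of the six time points

def passes_filter(green_sb, red_sb,
                  threshold=THRESHOLD, min_passing=MIN_PASSING):
    """True if enough time points have either channel above the cutoff."""
    passing = sum(1 for g, r in zip(green_sb, red_sb)
                  if g > threshold or r > threshold)
    return passing >= min_passing

def genes_in_all_agents(per_agent):
    """Keep only genes that pass the filter for every agent.

    per_agent maps agent name -> {gene: (green_sb_list, red_sb_list)}.
    """
    kept = None
    for genes in per_agent.values():
        passing = {gene for gene, (green, red) in genes.items()
                   if passes_filter(green, red)}
        kept = passing if kept is None else kept & passing
    return kept or set()
```

Taking the intersection across agents mirrors the requirement that a gene be present in every agent's data set before distances among agents are computed.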
Euclidean distances were calculated by the formula:

$$D_{A_1A_2} = \sqrt{\sum_{N}\left(R_{N,A_1} - R_{N,A_2}\right)^{2}}$$

where D is the Euclidean distance for agents A1 and A2 at any time points, N is each sequence in the database, and R is the red:green ratio from the microarray.
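As an illustration, the standard N-dimensional Euclidean distance between two expression profiles can be computed as follows. The function name is hypothetical, and the sketch assumes the two lists hold the ratios for the same sequences in the same order.

```python
import math

def euclidean_distance(ratios_a1, ratios_a2):
    """Euclidean distance between two profiles of per-sequence ratios."""
    if len(ratios_a1) != len(ratios_a2):
        raise ValueError("profiles must cover the same sequences")
    # Sum the squared per-sequence differences, then take the square root.
    return math.sqrt(sum((a - b) ** 2
                         for a, b in zip(ratios_a1, ratios_a2)))
```

A smaller distance indicates that two agent and time point combinations produced more similar expression profiles across the sequences compared.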
Western Blot Analysis.
For detection of acetylated histone H4, cells were harvested in PBS, washed, and resuspended in 5 volumes of 10 mM HEPES (pH 7.9), 1.5 mM MgCl2, 10 mM KCl, 0.5 mM DTT, and 1.5 mM phenylmethylsulfonyl fluoride; sulfuric acid was added to a final concentration of 0.2 M, and samples were incubated on ice for 30 min. Samples were centrifuged at 11,000 × g for 10 min at 4°C, and the supernatant fraction containing the acid-soluble proteins was dialyzed in a porous membrane (molecular weight cutoff of 3500; Spectrum Laboratories, Rancho Dominguez, CA) against 200 ml of 0.1 M acetic acid, twice for 1 h each, and three times against 200 ml of distilled H2O for 1 h, 3 h, and overnight, respectively. Proteins were resolved on a 15% Tris-glycine SDS gel (Bio-Rad, Richmond, CA) and transferred to a nitrocellulose membrane overnight (Bio-Rad). Blots were blocked in 5% nonfat milk in PBS and incubated with anti-acetylated histone H4 (Upstate Biotechnology, Lake Placid, NY; 1:1000) and a horseradish peroxidase-conjugated antirabbit secondary antibody (1:2000; Amersham) for 1 h each. Antibody binding was detected using the ECL reagent according to the manufacturer’s instructions.
When two different control cultures were compared, with RNA from the first used as template for preparation of a Cy3 (green)-labeled probe and RNA from the second for a Cy5 (red)-labeled probe, and the mixed probes hybridized to a single array of 8,063 sequences prepared by the Albert Einstein College of Medicine microarray facility (49), the normalized ratios of expression were tightly grouped around the mean (Fig. 1A). In contrast, as soon as 30 min after butyrate treatment, there was a shift in this pattern, which continued to expand progressively [note log(2) scale] as a function of time over the 48-h time course of butyrate treatment (Fig. 1B). Extensive alterations in gene expression were also detected in response to sulindac (Fig. 1C), with much more limited changes induced by TSA and curcumin (Fig. 1, D and E).
The Cluster and TreeView programs of Eisen et al. (50) were used to analyze the data from all experiments to characterize the extensive reprogramming initiated by butyrate and to determine how this compared with that triggered by each of the other agents. The results are shown in Fig. 2, with red representing increased expression, green representing decreased expression, and the magnitude of the change depicted by the intensity of color. Fig. 2 illustrates a progressive recruitment of sequences as the cells responded to butyrate. During the first 30 min, a small number of sequences were elevated in expression (Fig. 2A) or were repressed (Fig. 2F) by butyrate. However, after this initial lag, the reprogramming expanded considerably, with a striking number of sequences showing altered expression at 2 h (Fig. 2, B and G) and 12 h (Fig. 2, C and H). This expansion of the response continued throughout the experiment but began to moderate by 16–24 h (Fig. 2, D, E, I, and J).
Over the 48-h time course analyzed, 256 sequences were elevated in expression by butyrate, whereas 333 sequences were repressed. This represents ∼7% of the sequences assayed. We believe, however, that this is a minimum estimate of the extent of reprogramming that takes place in response to butyrate (see “Discussion”).
Blast searches of each of the sequences altered in expression with butyrate were performed to update the databases. Of the 589 altered sequences, 345 represent named sequences in the database, whereas the remainder are unnamed or expressed sequence tags.
Also shown in these analyses is the extent of altered expression of each sequence during the time course of treatment with the other agents. There were clusters of similar change in response to the agents (e.g., in Fig. 2, clusters 1, 2, 3, and 8), whereas two clusters (nos. 4 and 6) behaved similarly only for butyrate and sulindac. In contrast, clusters 5 and 7 responded similarly to sulindac and curcumin but were markedly different in response to butyrate. In fact, the data are most striking in terms of the differences in response among the agents.
To more fully investigate the relationships among the agents, TreeView was used to produce a dendrogram that reflects the relative extent of relatedness among all of the time points for each of the agents (Fig. 3A). The 30-min time points for all of the agents clustered together, reflecting the fact that at this early time, there had been limited change in response to each agent (Fig. 1), and hence in each case, the cells still were most like untreated controls. In addition, the time points for each agent tended to cluster, and the butyrate response clustered most closely to that of TSA.
To quantify this analysis and to compare the response in detail of each agent to the physiological inducer butyrate, we calculated the N-dimensional Euclidean distance for each of the butyrate time points to all of the time points for all of the agents (see “Materials and Methods”). The data were then divided into quartiles of “relatedness” and are graphically represented in Fig. 3B.
Response to all of the agents was most related to the effects of butyrate at early time points, reflecting the fact that alterations in expression proceed as a function of time, as noted above. With increasing time, the responses diverged as the cells responded to each agent. As reflected in the dendrogram, among the agents, the profiles of gene expression induced by TSA were more like those of butyrate than were those induced by either sulindac or curcumin.
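The division into quartiles of relatedness could be carried out as in the sketch below. It assumes, as an illustration only, that the quartiles are assigned by rank over all pairwise distances; the text does not state the exact binning rule, and the function name and key format are hypothetical.

```python
def quartile_labels(distances):
    """Assign each comparison to a quartile by rank of its distance.

    distances maps a comparison key, e.g. ("butyrate 2 h", "TSA 2 h"),
    to its Euclidean distance. Quartile 1 holds the smallest distances
    (most related); quartile 4 holds the largest (least related).
    """
    ordered = sorted(distances, key=distances.get)
    n = len(ordered)
    # Integer arithmetic splits the ranked list into four near-equal bins.
    return {key: (rank * 4) // n + 1 for rank, key in enumerate(ordered)}
```

Rank-based binning keeps the four groups close to equal in size, which is convenient for a four-level graphical display.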
Because butyrate and TSA are both inhibitors of HDAC activity, we hypothesized that the greater similarity between butyrate and TSA may in part be attributable to this shared mechanism of action. We therefore investigated whether we could identify a population of genes for which altered expression by butyrate and TSA was attributable to altered HDAC activity by each.
Inhibition of HDAC activity results in histone hyperacetylation. We therefore measured the degree to which histone H4 was acetylated by Western blot analysis (Fig. 4A) and normalized it to the total histone H1 level for each time point after either butyrate or TSA treatment (Fig. 4B). The kinetics of alteration were markedly different for the two agents. As shown in Fig. 4, A and B, butyrate induced a gradual increase in histone H4 acetylation that peaked at 16 h and then fell but remained well above baseline through 48 h. In contrast, TSA produced a rapid and short-lived increase in H4 acetylation that peaked at 2 h and then returned to baseline. We then selected all genes from the database that were altered in expression at any time point for butyrate and TSA but not for sulindac or curcumin. Five hundred eighty-four sequences were selected based on these criteria, and the profile of altered expression of these genes in TreeView is shown in Fig. 4C. There are two gene clusters in this population of sequences (clusters 9 and 10 in Fig. 4C) that were elevated or repressed in expression, respectively, with the same kinetics as the alterations in H4 acetylation induced by butyrate and TSA, i.e., altered expression that began at 2 h and persisted with butyrate but a sharp spike of alteration at only 2 h with TSA. The tight correspondence between the kinetics of altered histone acetylation and the kinetics of altered expression for genes in clusters 9 and 10 strongly argues that changes in HDAC activity, known to be inhibited by both butyrate and TSA, underlie the changes in expression for these genes. The identities of the genes in these clusters are listed on the web site.
The overall differences in profile of gene expression, and hence in mechanism of response stimulated by the different agents, were surprising, especially between the responses to butyrate and sulindac, both of which stimulate a similar cell cycle arrest, apoptosis, and a similar extensive reprogramming (Fig. 1). This was, therefore, investigated more closely. Similar to the analysis presented in Fig. 2 for butyrate, the Cluster and TreeView programs were used to display the gene alterations in response to sulindac, a representative of a class of agents, nonsteroidal anti-inflammatory drugs, that have significant chemopreventive activity for colon cancer. Similar to butyrate, there was a progressive recruitment of sequences over the 48-h treatment with sulindac (Fig. 5). In total, 534 sequences were recruited into the response to sulindac, similar to the 589 recruited by butyrate. However, the overlap between these populations was limited, consisting of 145 sequences (Fig. 6, A and B). Even more surprising was that of these 145 sequences, 53 (37%) showed the opposite response to butyrate and sulindac, rising with one of the agents but decreasing with the other (Fig. 6).
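The overlap comparison can be expressed as a simple set operation. The sketch below is illustrative only: the function name and the encoding of each response as +1 (induced) or -1 (repressed) are assumptions, not the representation used in the study's database.

```python
def compare_responses(butyrate, sulindac):
    """Return (shared, opposite) gene sets for two agents.

    Each argument maps a gene identifier to +1 (induced) or -1 (repressed);
    genes that did not respond to an agent are simply absent from its dict.
    """
    shared = set(butyrate) & set(sulindac)      # altered by both agents
    opposite = {gene for gene in shared
                if butyrate[gene] != sulindac[gene]}  # rise with one, fall with the other
    return shared, opposite

# Fraction reported in the text: 53 of the 145 shared sequences.
reported_percent = round(100 * 53 / 145)  # 37
```

With the reported counts, 53 of 145 is 36.6%, which rounds to the 37% quoted in the text.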
We delved more deeply into the differences between the butyrate and sulindac response by focusing on two subsets of genes from 22 functional gene classes we defined in the database (available on the web site). The class of genes involved in signaling was by far the largest functional class, consisting of 275 members. This class was broadly defined to include both molecules that may interact with components of signal transduction pathways (e.g., E-cadherin) and effector molecules that alter transcription (e.g., fos) as well as signaling components (e.g., kinases) themselves. Within this class, 33 genes increased in expression with butyrate (Fig. 7A, group 1 signaling genes), and 45 were repressed (Fig. 7C, group 2 signaling genes). The 78 genes that encode components of signaling pathways that are altered over the time course of butyrate treatment represent 27% of all named genes that change in the experiment. Thus, extensive alterations in a large number of signaling pathways and their components characterize the butyrate response. However, for the group 1 signaling genes, all of which were induced by butyrate (Fig. 7A), most sequences did not respond to sulindac, and of those that did, the number of genes that showed the same response (induction) and the opposite response (repression) was similar (Fig. 7B). A similar relationship between response to butyrate and sulindac exists for the group 2 genes, all of which were repressed by butyrate (Fig. 7C) but showed marked heterogeneity in response to sulindac, with most showing no change in expression with sulindac treatment (Fig. 7D). This illustrates that the signaling pathways that responded to butyrate and sulindac overlapped only to a very limited degree.
The second functional class of sequences investigated was a subset of 61 genes in the database involved in regulation of cell cycle progression (e.g., cyclins, cyclin-dependent kinases, RB, myc; see Fig. 8). By analysis of the Euclidean distances for this gene subset for all of the agents, it was first determined that the time points at which the response was maximal for the cell cycle genes (i.e., maximally different from control) were 48 h for sulindac and curcumin, 16 h for butyrate, and 12 h for TSA (Fig. 4). We then calculated the Euclidean distances for the cell cycle genes comparing each of TSA, sulindac, and curcumin to butyrate at these time points. The analysis (Fig. 8A) demonstrated that for the functional class of genes responsible for cell cycle progression and arrest, the profile of expression in response to butyrate was most like the profile initiated by TSA and less like that initiated by sulindac, although all three agents cause a G0-G1 cell cycle arrest. Interestingly, the cell cycle response to butyrate was least like that initiated by curcumin, which is consistent with the G2-M, rather than G0-G1, arrest caused by this compound. Finally, we extended the analysis to the Caco-2 cell line, a colonic carcinoma cell line that undergoes spontaneous G0-G1 cell cycle arrest and absorptive cell differentiation upon contact inhibition of growth (48). For the cell cycle genes, the butyrate response was most like that of arrested Caco-2 cells and close to that of TSA. Fig. 8, B and C, present analyses of the cell cycle genes involved in these responses and list their identities. These analyses confirmed that the underlying mechanism of cell cycle arrest is more closely related between butyrate and TSA than between butyrate and sulindac and demonstrated the closest similarity in mechanism between the arrest induced by the physiological regulator butyrate in SW620 cells and the spontaneous Caco-2 cell cycle arrest.
Finally, these data reflect the clear differences in mechanism of the G0-G1 cell cycle arrest induced by butyrate and the G2-M arrest induced by curcumin.
We have used microarray technology for analysis of the reprogramming that takes place when the SCFA butyrate, a physiological regulator of cell maturation in the colon, stimulates pathways of cell cycle arrest, lineage-specific differentiation, and apoptosis of colonic cells in culture (2, 3, 4). We have compared this with the response of the cells to: TSA, like butyrate an inhibitor of HDAC (13, 14, 15, 30); sulindac, a nonsteroidal anti-inflammatory drug undergoing clinical trials to confirm and extend its utility as a chemopreventive agent for colon cancer (31, 32, 33, 34, 35); and curcumin, a naturally occurring chemopreventive agent found in mustard and curry (36, 37, 38, 39, 40, 41).
The response of the cells to butyrate, which stimulates a G0-G1 cell cycle arrest, differentiation along the absorptive cell lineage, and an apoptotic cascade, is complex. There is a progressive recruitment of genes into the reprogramming as a function of time, which is similar to an expansion seen in the Caco-2 model of spontaneous differentiation (unpublished). Overall, >7% of the sequences exhibit sustained alterations in expression beyond levels defined by the distribution of 95% of the sequences in a control experiment in which two populations of untreated cells were compared. Because the cells express a complement of approximately 10,000–15,000 genes (51), this implies that at least 1000 sequences show substantial alterations in expression, an extent of change consistent with other reports on reprogramming in differentiation and transformation (20, 51, 52, 53, 54, 55, 56). This is a minimal estimate of the complexity of the response for two reasons: (a) we have limited our analyses to those sequences for which altered expression persists during the experiment once the sequence is recruited into the response. There are many more sequences that show transient increases or decreases over a more limited period (not shown; entire database available on the web site). These changes may also be quite important in generating and regulating the response to butyrate; and (b) we have also limited the presentation to genes showing relatively large changes in expression. In fact, ANOVA for sequences that fall within the ratios seen in comparing two control cultures (Fig. 1) shows significant alterations in butyrate-treated cells compared with control (data not shown), therefore adding another, more subtle, dimension to the cell maturation program. The first conclusion, therefore, is that a physiological regulator of colonic cell maturation initiates a highly complex reprogramming of the cell.
The complexity of this reprogramming has important implications for understanding how pathways of cell cycling, lineage-specific differentiation, and cell death are coordinated to maintain homeostasis and proper functioning of the mucosa. For example, within the two functional classes we have presented, genes involved in signal transduction and cell cycling, there are significant modulations in expression of many genes that are generally considered fundamentally important in pathways of cell maturation (e.g., phospholipase 3, NF-κB, integrin β5, cyclins A, B, and C, c-myc), and many of these distinguish the butyrate response from the response to sulindac. It would be straightforward to select any one of these as an example of an important change possibly critical for the response of the cells and pursue questions of its regulation and effects of its altered expression by methods now standard in molecular biology. However, the large number of such alterations suggests that each cannot be considered out of context of the overall response. In addition, for any individual sequence, measurement of expression alone cannot reflect contributions of translational or posttranslational regulation. Consequently, consideration of the interactions of expression of large numbers of genes, which would encompass altered expression of genes involved in these other levels of regulation, may more accurately reflect cell phenotype. Thus, it is the integration of many altered regulatory and functional circuits that determines the probability of a cell’s behavior, such as continuation of proliferation, or cell cycle arrest accompanied by differentiation and/or apoptosis.
This complex reprogramming appears to be organized into a cascade of altered pathways, as evidenced by the smooth expansion of the magnitude of the response as a function of time and the recruitment of a large number of signaling pathways and their components. In contrast, TSA and curcumin induce more limited changes, and these are transient and do not expand with time. Sulindac also stimulates extensive alterations that increase with time, but it is clear that the response of signaling pathways, genes involved in cell cycle progression and of other genes in general shows little overlap between sulindac and butyrate. Therefore, the second conclusion is that colonic cells use links that have evolved between pathways to integrate the response to a physiological regulator (butyrate), but these links are not recruited, and the integration is abrogated, in response to novel agents that cells have not seen during evolution or development.
In characterizing the altered profiles of gene expression induced by the agents, clusters of genes could be identified that exhibited similarity in response among the four different profiles. Thus, there are underlying mechanisms that the agents have in common. The potential for uncovering particular classes of genes that are coregulated was clearly demonstrated by the use of the differing kinetics of inhibition of HDAC activity by butyrate and TSA, coupled with the profiles of gene expression specifically altered by these agents, to identify two clusters of genes that are likely to be coordinately regulated by this mechanism.
However, despite instances of similarity in response among the agents, a most striking observation was how different the responses were at a molecular level, although three of the agents, butyrate, sulindac, and TSA, all generate a G0-G1 cell cycle arrest, dissipate the Δψm, trigger an apoptotic cascade, and up-regulate β-catenin-Tcf (9). This was most dramatically illustrated by analysis of two functional classes of genes: (a) there was little consistency in genes involved in signal transduction pathways when butyrate was compared with sulindac; (b) within a population of genes related to cell cycle progression, butyrate was shown to induce alterations in the profile of gene expression that were very similar to those induced during spontaneous Caco-2 cell maturation. However, although both butyrate and sulindac induce a G0-G1 arrest, the mechanism of sulindac-induced cell cycle arrest was quite different from that induced by butyrate, although not as different as the mechanism of curcumin-induced G2-M arrest.
The third conclusion, therefore, is that molecular profiling of response to a physiological regulator of cell maturation can provide important information for design and interpretation of chemopreventive strategies for colon cancer in at least three ways:
(a) Agents that have similar or different mechanisms of action can be characterized, combinations that may be synergistic in effect can be chosen, and this synergism can be evaluated by experiments on cell lines similar to those reported here. An excellent example of this was the analysis of the responses by N-dimensional Euclidean distance, which demonstrated greater similarity of response between butyrate and TSA and between sulindac and curcumin than between the two groups.
(b) The potential for toxicity or adverse side effects of agents may be evaluated by comparison to the response to a physiological regulator, in this case, butyrate. This is of particular importance in chemoprevention, where intervention is contemplated in essentially healthy individuals over many decades, and harmful effects must be minimal. In this regard, it is important that the response to sulindac, which is an inhibitor of gastrointestinal tumor formation, differed so markedly from that induced by butyrate, especially with regard to two important functional classes: signaling genes and cell cycle genes. Sulindac is known to have extensive side effects, including mucosal ulceration, and a potentially important association with development of rectal tumors in patients treated over lengthy periods (57, 58, 59) and of cecal tumors in mice that inherit a mutant apc allele. 7
(c) These databases on response in vitro will be useful for the evaluation of the effects of both chemopreventive and chemotherapeutic agents in clinical trials. As the databases are expanded, it will be possible to link responses in tumors to specific cellular behavior, such as pathways of cell cycle arrest, apoptosis, and lineage-specific differentiation by reference to these databases and to tailor treatment based on the knowledge of which pathways are defective in the tumor in combination with these data on which pathways respond to a given agent.
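The N-dimensional Euclidean distance comparison cited under (a) can be illustrated with a minimal sketch. It treats each agent's expression profile as a point in a space with one dimension per gene and computes the distance between every pair of agents. The function names, the use of log2 expression ratios, and all numerical values are assumptions made for illustration: the values are hypothetical, are not taken from the arrays, and were chosen only so that the toy result mirrors the grouping reported in the text.

```python
import math
from itertools import combinations


def euclidean_distance(profile_a, profile_b):
    """Distance between two expression profiles viewed as points in N-dimensional space."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genes in the same order")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


def pairwise_distances(profiles):
    """Map each unordered pair of agents to the distance between their profiles."""
    return {
        frozenset((x, y)): euclidean_distance(profiles[x], profiles[y])
        for x, y in combinations(profiles, 2)
    }


# Hypothetical log2 expression ratios for five genes (illustrative only, not measured data).
toy_profiles = {
    "butyrate": [1.0, 2.0, -1.0, 0.5, 0.0],
    "TSA": [0.8, 1.5, -0.5, 0.0, 0.2],
    "sulindac": [-1.0, 0.0, 1.5, -0.5, 2.0],
    "curcumin": [-0.5, 0.2, 1.0, -1.0, 1.5],
}

distances = pairwise_distances(toy_profiles)

# List agent pairs from most similar (smallest distance) to least similar.
for pair, d in sorted(distances.items(), key=lambda item: item[1]):
    first, second = sorted(pair)
    print(f"{first:>9} vs {second:<9} {d:.2f}")
```

With these invented values, the butyrate-TSA and sulindac-curcumin pairs have the smallest distances, which is the pattern described in (a). With real array data, the way ratios are scaled and which genes are included would strongly affect the distances, so those choices would need to be stated alongside any such comparison.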
To fully exploit the information gained from these experiments, the databases, analyses, and gene clusters identified, as well as expansions of the work that will encompass many more sequences and models of lineage-specific differentiation of colonic epithelial cells and response to additional chemopreventive agents, will be made available. 8
In summary, we have extended our original observations on quantitative profiling of gene expression that characterize risk for tumor formation in the colonic mucosa and response of transformed colonic cells to the physiological inducer butyrate. The data are expanded from an approach that encompassed middle abundant and abundant sequences (20, 21) to a greater complexity of >8,000 genes. Over a decade ago, we suggested that such quantitative profiling of gene expression would be important in understanding cell phenotype (60) and would have significant clinical utility (20, 21, 61). This is now borne out by these studies, as well as elegant work from other laboratories (28, 62, 63, 64, 65). Continued expansion of these databases to other systems and analysis of a larger proportion of the expressed sequences will provide new approaches for strategies and evaluation of chemoprevention of colon cancer.
We thank Geoff Childs and Aldo Massimi of the Albert Einstein College of Medicine microarray facility for help in this work and Martin Lesser and Qiuhu Shi of the Division of Biostatistics, North Shore University Hospital, for important discussions regarding data analysis.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
1 Supported in part by Grants CA75246, CA77552, and P13330 from the National Cancer Institute and by a fellowship (to J. M.) from the American Institute for Cancer Research.
2 To whom requests for reprints should be addressed, at Department of Oncology, Albert Einstein Cancer Center, Montefiore Hospital, 111 East 210th St., Bronx, NY 10467. Phone: (718) 920-4663; Fax: (718) 882-4464; Email:
3 The abbreviations used are: SCFA, short-chain fatty acid; HDAC, histone deacetylase; TSA, trichostatin A.
4 All databases, analyses, and gene clusters identified are made available on our web site at http://sequence.aecom.yu.edu/genome/.
5 Details of the robot and methods for producing and analyzing the arrays can be found at http://sequence.aecom.yu.edu/genome/.
6 A listing of all of the sequences and how they are differentially expressed as a function of time is posted on our web site at http://sequence.aecom.yu.edu/genome/.
7 K. Yang and M. Lipkin, unpublished observations.
8 Internet address: http://sequence.aecom.yu.edu/genome/.
- Received April 13, 2000.
- Accepted June 29, 2000.
- ©2000 American Association for Cancer Research.