Naomi Altman
Pennsylvania State University
Publication
Featured research published by Naomi Altman.
Journal of the American Statistical Association | 1990
Naomi Altman
Abstract Kernel smoothing is a common method of estimating the mean function in the nonparametric regression model y = f(x) + e, where f(x) is a smooth deterministic mean function and e is an error process with mean zero. In this article, the mean squared error of kernel estimators is computed for processes with correlated errors, and the estimators are shown to be consistent when the sequence of error processes converges to a mixing sequence. The standard techniques for bandwidth selection, such as cross-validation and generalized cross-validation, are shown to perform very badly when the errors are correlated. Standard selection techniques are shown to favor undersmoothing when the correlations are predominantly positive and oversmoothing when negative. The selection criteria can, however, be adjusted to correct for the effect of correlation. In simulations, the standard selection criteria are shown to behave as predicted. The corrected criteria are shown to be very effective when the correlation functi...
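To make the setting in this abstract concrete, the following is a minimal sketch, not the paper's method. It assumes a Nadaraya-Watson smoother with a Gaussian kernel, AR(1) errors as one example of a correlated error process, and plain leave-one-out cross-validation over a small bandwidth grid. The function names, the sine mean function, the grid, and the noise level are all hypothetical choices made for illustration; the corrected criteria described in the abstract are not implemented here.

```python
import math
import random


def gaussian_kernel(u):
    """Unnormalised Gaussian kernel; the constant cancels in the ratio below."""
    return math.exp(-0.5 * u * u)


def nw_smooth(x, y, x0, h, skip=None):
    """Nadaraya-Watson estimate of f(x0) with bandwidth h.

    If `skip` is an index, that observation is left out (used for CV).
    """
    num = 0.0
    den = 0.0
    for i, (xi, yi) in enumerate(zip(x, y)):
        if i == skip:
            continue
        w = gaussian_kernel((x0 - xi) / h)
        num += w * yi
        den += w
    return num / den


def loo_cv_score(x, y, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    n = len(x)
    return sum((y[i] - nw_smooth(x, y, x[i], h, skip=i)) ** 2 for i in range(n)) / n


def ar1_errors(n, rho, sigma, rng):
    """Stationary AR(1) errors with lag-one correlation rho and s.d. sigma."""
    e = [rng.gauss(0.0, sigma)]
    innovation_sd = sigma * math.sqrt(1.0 - rho * rho)
    for _ in range(n - 1):
        e.append(rho * e[-1] + rng.gauss(0.0, innovation_sd))
    return e


def simulate(rho, n=100, seed=0):
    """Return the CV-selected bandwidth for y = sin(2*pi*x) + AR(1) error."""
    rng = random.Random(seed)
    x = [(i + 0.5) / n for i in range(n)]  # fixed, equally spaced design
    e = ar1_errors(n, rho, 0.3, rng)
    y = [math.sin(2.0 * math.pi * xi) + ei for xi, ei in zip(x, e)]
    grid = [0.01, 0.02, 0.04, 0.08, 0.16]  # hypothetical bandwidth grid
    return min(grid, key=lambda h: loo_cv_score(x, y, h))


if __name__ == "__main__":
    for rho in (0.0, 0.8):
        print(f"rho = {rho}: CV-selected bandwidth = {simulate(rho)}")
```

Running the script with several seeds and values of `rho` is one way to explore how the selected bandwidth changes with the error correlation; a single run should not be read as confirming or refuting the abstract's claims.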
Plant Physiology | 2004
Guanfang Wang; Hongzhi Kong; Yujin Sun; Xiaohong Zhang; Wei Zhang; Naomi Altman; Claude W. dePamphilis; Hong Ma
Cyclins are primary regulators of the activity of cyclin-dependent kinases, which are known to play critical roles in controlling eukaryotic cell cycle progression. While there has been extensive research on cell cycle mechanisms and cyclin function in animals and yeasts, only a small number of plant cyclins have been characterized functionally. In this paper, we describe an exhaustive search for cyclin genes in the Arabidopsis genome and among available sequences from other vascular plants. Based on phylogenetic analysis, we define 10 classes of plant cyclins, four of which are plant-specific, and a fifth is shared between plants and protists but not animals. Microarray and reverse transcriptase-polymerase chain reaction analyses further provide expression profiles of cyclin genes in different tissues of wild-type Arabidopsis plants. Comparative phylogenetic studies of 174 plant cyclins were also performed. The phylogenetic results imply that the cyclin gene family in plants has experienced more gene duplication events than in animals. Expression patterns and phylogenetic analyses of Arabidopsis cyclin genes suggest potential gene redundancy among members belonging to the same group. We discuss possible divergence and conservation of some plant cyclins. Our study provides an opportunity to rapidly assess the position of plant cyclin genes in terms of evolution and classification, serving as a guide for further functional study of plant cyclins.
Nature Methods | 2014
Martin Krzywinski; Naomi Altman
of quartiles for box plots is a well-established convention: boxes or whiskers should never be used to show the mean, s.d. or s.e.m. As with the division of the box by the median, the whiskers are not necessarily symmetrical (Fig. 1b). The 1.5 multiplier corresponds to approximately ±2.7s (where s is s.d.) and 99.3% coverage of the data for a normal distribution. Outliers beyond the whiskers may be individually plotted. Box plot construction requires a sample of at least n = 5 (preferably larger), although some software does not check for this. For n < 5 we recommend showing the individual data points. Sample size differences can be assessed by scaling the box plot width in proportion to √n (Fig. 1b), the factor by which the precision of the sample’s estimate of population statistics improves as sample size is increased. To assist in judging differences between sample medians, a notch (Fig. 1b) can be used to show the 95% confidence interval (CI) for the median, given by m ± 1.58 × IQR/√n (ref. 1). This is an approximation based on the normal distribution and is accurate in large samples for other distributions. If you suspect the population distribution is not close to normal and your sample size is small, avoid interpreting the interval analytically in the way we have described for CI error bars (ref. 2). In general, when notches do not overlap, the medians can be judged to differ significantly, but overlap does not rule out a significant difference. For small samples the notch may span a larger interval than the box (Fig. 2). The exact position of box boundaries will be software dependent. First, there is no universally agreed-upon method to calculate quartile values, which may be based on simple averaging or linear interpolation. Second, some applications, such as R, use hinges instead of quartiles for box boundaries. The lower and upper hinges are the median of the...
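The quantities described in this excerpt can be computed directly. The sketch below is one possible reading, assuming quartiles by linear interpolation (Python's `statistics.quantiles` with `method="inclusive"`); as the text notes, other software may use a different quartile rule or hinges, so box boundaries can differ. The function name `box_plot_stats` and the returned dictionary keys are hypothetical.

```python
import math
import statistics


def box_plot_stats(data):
    """Summary values for a box plot with 1.5 x IQR whiskers and a median notch."""
    xs = sorted(data)
    n = len(xs)
    if n < 5:
        # The text recommends plotting individual points for n < 5.
        raise ValueError("need n >= 5 for a box plot; show the data points instead")

    # Quartiles by linear interpolation (an assumption; conventions vary).
    q1, median, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1

    # Whiskers extend to the most extreme points within 1.5 x IQR of the box.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in xs if low_fence <= v <= high_fence]
    outliers = [v for v in xs if v < low_fence or v > high_fence]

    # Notch: approximate 95% CI for the median, m +/- 1.58 x IQR / sqrt(n).
    half_notch = 1.58 * iqr / math.sqrt(n)

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
        "notch": (median - half_notch, median + half_notch),
        "relative_width": math.sqrt(n),  # box width scaled in proportion to sqrt(n)
    }


if __name__ == "__main__":
    sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
    for key, value in box_plot_stats(sample).items():
        print(key, value)
```

For the sample above, the value 100 lies beyond the upper fence and is reported as an outlier, so the upper whisker stops at 9.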
Journal of Statistical Planning and Inference | 1995
Naomi Altman; Christian Léger
Abstract Leave-one-out cross-validation is a popular and readily implemented heuristic for bandwidth selection in nonparametric smoothing problems. In this note we elucidate the role of leave-one-out selection criteria by discussing a criterion introduced by Sarda (J. Statist. Plann. Inference 35 (1993) 65–75) for bandwidth selection for kernel distribution function estimators (KDFEs). We show that for this problem, use of the leave-one-out KDFE in the selection procedure is asymptotically equivalent to leaving none out. This contrasts with kernel density estimation, where use of the leave-one-out density estimator in the selection procedure is critical. Unfortunately, simulations show that neither method works in practice, even for samples of size as large as 1000. In fact, we show that for any fixed bandwidth, the expected value of the derivative of the leave-none-out criterion is asymptotically positive. This result and our simulations suggest that the criteria are increasing and that for sufficiently large samples (e.g., n = 100), the smallest available bandwidth will always be selected, thus contradicting the optimality result of Sarda for this estimator. As an alternative to minimizing a selection criterion, we propose a plug-in estimator of the asymptotically optimal bandwidth. Simulations suggest that the plug-in is a good estimator of the asymptotically optimal bandwidth even for samples as small as 10 observations and is not too far from the finite sample bandwidth.
Ecology | 2002
Alexander S. Flecker; Brad W. Taylor; Emily S. Bernhardt; James M. Hood; William K. Cornwell; Shawn R. Cassatt; Michael J. Vanni; Naomi Altman
Ecologists have long been interested in understanding the strengths of consumer and resource limitation in influencing communities. Here we ask three questions concerning the relative importance of nutrients and grazing fishes to primary producers of a tropical Andean stream: (1) Are stream algae nutrient limited? (2) Are top-down and bottom-up forces of dual importance in limiting primary producers? (3) Do grazing fishes modulate the degree of resource limitation? We obtained several lines of evidence suggesting that Andean stream algae are nitrogen limited. Addition of nitrogen in flow-through channels resulted in major increases in algal standing crop, whereas there were no measurable effects of phosphorus enrichment. Interestingly, the N2-fixing cyanobacteria Anabaena was one of the taxa that responded most dramatically to the addition of nitrogen. Moreover, nutrient uptake rates were significantly higher for inorganic nitrogen (NO3-N and NH4-N) compared to phosphorus (PO4-P). Nutrients and the presence of grazing fishes were manipulated simultaneously in a series of experiments by using nutrient-diffusing substrates in fish exclusions vs. open cages accessible to the natural fish assemblage. We observed strong effects of both nitrogen addition and consumers on algal standing crop, although consumer limitation was found to be of considerably greater magnitude than resource limitation in influencing algal biomass and composition. Finally, the degree of resource limitation varied as a consequence of grazing fishes. Experiments examining nutrient limitation in the presence and absence of fishes showed that the response to nitrogen enrichment was significantly greater on substrates accessible to natural fish assemblages compared to substrates where grazing fishes were excluded. These experiments demonstrate simultaneous and interactive effects of top-down and bottom-up factors in limiting primary producers of tropical Andean streams.
Whereas other studies have shown that consumers affect nutrient supply in ecosystems, our findings suggest that consumers can play an important role in influencing nutrient
International Journal of Food Microbiology | 1990
Linda Wimpfheimer; Naomi Altman; Joseph H. Hotchkiss
The development of Listeria monocytogenes Scott A, serotype 4 and aerobic plate counts on minced raw chicken were determined independently at 4, 10 and 27 degrees C. Samples were packaged in flexible film under two modified atmospheres (one containing oxygen and one containing no oxygen) or air. The anaerobic modified atmosphere (75:25, CO2:N2) resulted in the failure of both the aerobic plate counts and L. monocytogenes to grow at all temperatures. Both the L. monocytogenes and aerobic plate counts grew in air at all temperatures. The aerobic modified atmosphere (72.5:22.5:5, CO2:N2:O2), which more closely duplicates commercial practice, inhibited the increase in aerobic plate counts by more than 4 log10 cfu/g compared to air at 4 degrees C. However, the L. monocytogenes was not affected by this atmosphere and increased in numbers by nearly 6 log10 cfu/g at 4 degrees C in 21 days. Regression analysis of the log10 growth and 95% confidence intervals showed that the differences between aerobic plate counts and L. monocytogenes in modified atmosphere were large. The ability of L. monocytogenes to grow in the aerobic modified atmosphere was not affected by level of the L. monocytogenes inoculum nor by the initial level of aerobic plate counts. These data show that modified atmosphere packaging of raw chicken (and probably other meats) can substantially inhibit the aerobic spoilage flora while allowing pathogenic L. monocytogenes to increase.
Nucleic Acids Research | 2007
P. Kerr Wall; Jim Leebens-Mack; Kai Müller; Dawn Field; Naomi Altman; Claude W. dePamphilis
The PlantTribes database (http://fgp.huck.psu.edu/tribe.html) is a plant gene family database based on the inferred proteomes of five sequenced plant species: Arabidopsis thaliana, Carica papaya, Medicago truncatula, Oryza sativa and Populus trichocarpa. We used the graph-based clustering algorithm MCL [Van Dongen (Technical Report INS-R0010 2000) and Enright et al. (Nucleic Acids Res. 2002; 30: 1575–1584)] to classify all of these species’ protein-coding genes into putative gene families, called tribes, using three clustering stringencies (low, medium and high). For all tribes, we have generated protein and DNA alignments and maximum-likelihood phylogenetic trees. A parallel database of microarray experimental results is linked to the genes, which lets researchers identify groups of related genes and their expression patterns. Unified nomenclatures were developed, and tribes can be related to traditional gene families and conserved domain identifiers. SuperTribes, constructed through a second iteration of MCL clustering, connect distant, but potentially related gene clusters. The global classification of nearly 200 000 plant proteins was used as a scaffold for sorting ∼4 million additional cDNA sequences from over 200 plant species. All data and analyses are accessible through a flexible interface allowing users to explore the classification, to place query sequences within the classification, and to download results for further study.
Plant Molecular Biology | 2005
Xiaohong Zhang; Baomin Feng; Qing Zhang; Diya Zhang; Naomi Altman; Hong Ma
We have used oligonucleotide microarrays to detect Arabidopsis gene expression during early flower development. Among the 22,746 genes represented on the Affymetrix ATH1 chip, approximately 14,660 (64.5%) genes were expressed with signal intensity at or more than 50 in each of the six organs/structures examined, including young inflorescences (floral stages 1–9), stage-12 floral buds, developing siliques, leaves, stems, and roots. 17,583 genes were expressed with an intensity at or above 50 in at least one tissue, including 12,245 genes that were expressed in all the six tissues. Comparison of genes expressed between young inflorescence or stage-12 floral buds with other tissues suggests that relatively large numbers of genes are expressed at similar levels in tissues that are related morphologically and/or developmentally, as supported by a cluster analysis with data from two other studies. Further analysis of the genes preferentially expressed in floral tissues has uncovered new genes potentially involved in Arabidopsis flower development. One hundred and four genes were determined to be preferentially expressed in young inflorescences, including 22 genes encoding putative transcription factors. We also identified 105 genes that were preferentially expressed in three reproductive structures (the young inflorescences, stage-12 floral buds and developing siliques), when compared with the vegetative tissues. RT-PCR results of selected genes are consistent with the results from these microarrays and suggest that the relative signal intensities detected with the Affymetrix microarray are reliable estimates of gene expression.
Nature Methods | 2013
Martin Krzywinski; Naomi Altman
The meaning of error bars is often misinterpreted, as is the statistical significance of their overlap.
Proceedings of the National Academy of Sciences of the United States of America | 2009
Xinwei Han; Xia Wu; Wen-Yu Chung; Tao Li; Anton Nekrutenko; Naomi Altman; Gong Chen; Hong Ma
Brain structure and function experience dramatic changes from embryonic to postnatal development. Microarray analyses have detected differential gene expression at different stages and in disease models, but gene expression information during early brain development is limited. We have generated >27 million reads to identify mRNAs from the mouse cortex for >16,000 genes at either embryonic day 18 (E18) or postnatal day 7 (P7), a period of significant synaptogenesis for neural circuit formation. In addition, we devised strategies to detect alternative splice forms and uncovered more splice variants. We observed differential expression of 3,758 genes between the 2 stages, many with known functions or predicted to be important for neural development. Neurogenesis-related genes, such as those encoding Sox4, Sox11, and zinc-finger proteins, were more highly expressed at E18 than at P7. In contrast, the genes encoding synaptic proteins such as synaptotagmin, complexin 2, and syntaxin were up-regulated from E18 to P7. We also found that several neurological disorder-related genes were highly expressed at E18. Our transcriptome analysis may serve as a blueprint for gene expression pattern and provide functional clues of previously unknown genes and disease-related genes during early brain development.