Publication


Featured research published by Pedro J. Moreno.


international acm sigir conference on research and development in information retrieval | 2001

Topic segmentation with an aspect hidden Markov model

David M. Blei; Pedro J. Moreno

We present a novel probabilistic method for topic segmentation on unstructured text. One previous approach to this problem uses the hidden Markov model (HMM) to model sequence data probabilistically [7]. The HMM treats a document as mutually independent sets of words generated by a latent topic variable in a time series. We extend this idea by embedding Hofmann's aspect model for text [5] into the segmenting HMM to form an aspect HMM (AHMM). In doing so, we provide an intuitive topical dependency between words and a cohesive segmentation model. We apply this method to segment unbroken streams of New York Times articles as well as noisy transcripts of radio programs on SpeechBot, an online audio archive indexed by an automatic speech recognition engine. We provide experimental comparisons which show that the AHMM outperforms the HMM for this task.
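As a rough illustration of the baseline HMM idea described in this abstract (not the AHMM, and not the authors' implementation), the sketch below treats each window of words as an independent bag emitted by one latent topic and recovers the most likely topic sequence with Viterbi decoding. The function name `segment_topics`, the `stay_prob` and `floor` parameters, and the toy topic models are all assumptions made for this example.

```python
import math

def segment_topics(windows, topic_word_probs, stay_prob=0.9, floor=1e-6):
    """Viterbi decoding of one topic label per window of words.

    windows: list of word lists (hypothetical pre-split text stream).
    topic_word_probs: dict mapping topic -> dict mapping word -> probability.
    stay_prob: assumed probability of keeping the same topic between windows.
    floor: assumed probability assigned to words a topic has never seen.
    """
    if not windows:
        return []
    topics = list(topic_word_probs)
    n = len(topics)
    log_stay = math.log(stay_prob)
    log_switch = math.log((1.0 - stay_prob) / (n - 1)) if n > 1 else -math.inf

    def emit(topic, window):
        # Words in a window are treated as independent given the topic.
        probs = topic_word_probs[topic]
        return sum(math.log(probs.get(word, floor)) for word in window)

    def trans(prev, cur):
        return log_stay if prev == cur else log_switch

    # Uniform prior over topics for the first window.
    score = {t: -math.log(n) + emit(t, windows[0]) for t in topics}
    back = []
    for window in windows[1:]:
        new_score, pointers = {}, {}
        for t in topics:
            best_prev = max(topics, key=lambda p: score[p] + trans(p, t))
            new_score[t] = score[best_prev] + trans(best_prev, t) + emit(t, window)
            pointers[t] = best_prev
        score = new_score
        back.append(pointers)

    # Trace back from the best final topic.
    path = [max(topics, key=score.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

A segment boundary is then any position where consecutive labels in the returned path differ.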


international conference on multimedia and expo | 2004

Semantic analysis of song lyrics

Andrew Kositsky; Pedro J. Moreno

We explore the use of song lyrics for automatic indexing of music. Using lyrics mined from the Web, we apply a standard text processing technique to characterize their semantic content. We then determine artist similarity in this space. We found lyrics can be used to discover natural genre clusters. Experiments on a publicly available set of 399 artists showed that determining artist similarity using lyrics is better than random, but inferior to a state-of-the-art acoustic similarity technique. However, the approaches made different errors, suggesting they could be profitably combined.
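The abstract does not name the text processing technique it uses. As a stand-in, the sketch below compares artists by the cosine similarity of raw term-count vectors built from their lyrics. This is a deliberately simple substitute chosen for illustration, and the helper names `term_vector` and `cosine_similarity` are hypothetical.

```python
import math
from collections import Counter

def term_vector(lyrics: str) -> Counter:
    """Bag-of-words counts for one artist's concatenated lyrics (lowercased)."""
    return Counter(lyrics.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-count vectors; 0.0 if either is empty."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Ranking all other artists by this score against a query artist gives one possible notion of "similarity in this space".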


IEEE Transactions on Multimedia | 2002

SpeechBot: an experimental speech-based search engine for multimedia content on the web

J.-M. Van Thong; Pedro J. Moreno; B. Fidler; K. Maffey; Matthew T. Moores

As the Web transforms from a text-only medium into a more multimedia-rich medium, the need arises to perform searches based on the multimedia content. In this paper, we present an audio and video search engine to tackle this problem. The engine uses speech recognition technology to index spoken audio and video files from the World Wide Web (WWW) when no transcriptions are available. If transcriptions (even imperfect ones) are available, we can also take advantage of them to improve the indexing process. Our engine indexes several thousand talk and news radio shows covering a wide range of topics and speaking styles from a selection of public Web sites with multimedia archives. Our Web site is similar in spirit to normal Web search sites; it contains an index, not the actual multimedia content. The audio from these shows suffers in acoustic quality due to bandwidth limitations, coding, compression, and poor acoustic conditions. Our word error rate (WER) results using appropriately trained acoustic models show remarkable resilience to the high compression, although many factors combine to increase the average WERs over standard broadcast news benchmarks. We show that, even if the transcription is inaccurate, we can still achieve good retrieval performance for typical user queries (77.5%).
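For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions, insertions) between a recognizer's output and a reference transcript, divided by the number of reference words. The sketch below computes it under that standard definition; the function name `word_error_rate` and the whitespace tokenization are assumptions of this example, not details taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.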


IEEE Computer | 2002

From multimedia retrieval to knowledge management

Pedro J. Moreno; J.-M. Van Thong; Gareth J. F. Jones

Developments in information retrieval technologies can make multimedia data as pervasive and important as textual sources in knowledge management systems. The authors suggest ways in which speech-based multimedia information retrieval technologies can evolve into full-fledged knowledge management systems.


international conference on multimedia and expo | 2004

News Tuner: a simple interface for searching and browsing radio archives

Jon Marston; Gavin Maccarthy; Pedro J. Moreno; J.-M. Van Thong

In this paper we present a new Web-based application, called the News Tuner, for searching and browsing large radio archives. While popular search engines provide means for finding text and images, our approach combines semantic and acoustic search for efficient retrieval of audio documents. Semantic search allows the user to retrieve stories for a given concept, while acoustic search allows random access within stored audio files. Our experiments on over 1700 programs show that our method is effective at quickly retrieving stories that would be difficult to find otherwise. The News Tuner paradigm is intended primarily for news and talk radio programs; however, it may be applied to browsing and searching any spoken word audio content.


international conference of the ieee engineering in medicine and biology society | 2004

Reducing the cost of protein identifications from mass spectrometry databases

Leonidas I. Kontothanassis; David Goddeau; Pedro J. Moreno; R. Hookway; D. Sarracino

We present two techniques to improve the computational efficiency of protein discovery from mass spectrometry databases: noise filtering and hierarchical searching. Our approaches are orthogonal to existing algorithms and are based on the observation that typical mass spectrometry data contains a large amount of noise that can lead to wasteful computation. Our first improvement uses standard machine learning techniques with novel feature vectors derived from the mass spectra to identify and filter the noisy spectra. We demonstrate that this approach results in computational gains of around 38% with less than 10% loss of peptides. Additionally, we present a hierarchical searching scheme in which most samples are matched against a small database at low computational cost, leaving only a small number of samples to be searched against larger databases. Combining this scheme with the machine learning filters leads to a further performance improvement of 3%.
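The hierarchical searching scheme can be pictured as a two-tier lookup: try the small database first and fall through to the large one only when no sufficiently good match is found. The sketch below is a minimal version of that control flow under stated assumptions. The scoring function, the acceptance `threshold`, and the names `best_match` and `hierarchical_search` are hypothetical, since the abstract does not specify how matches are scored or accepted.

```python
from typing import Callable, Optional, Sequence, Tuple

Match = Tuple[str, float]  # (database entry, score)

def best_match(sample: str, database: Sequence[str],
               score: Callable[[str, str], float]) -> Optional[Match]:
    """Return the highest-scoring entry in the database, or None if it is empty."""
    best: Optional[Match] = None
    for entry in database:
        s = score(sample, entry)
        if best is None or s > best[1]:
            best = (entry, s)
    return best

def hierarchical_search(sample: str, small_db: Sequence[str],
                        large_db: Sequence[str],
                        score: Callable[[str, str], float],
                        threshold: float) -> Tuple[Optional[Match], str]:
    """Search the small database first; use the large one only as a fallback."""
    hit = best_match(sample, small_db, score)
    if hit is not None and hit[1] >= threshold:
        return hit, "small"  # cheap path: most samples are expected to stop here
    return best_match(sample, large_db, score), "large"
```

The second element of the result records which tier answered, which makes it easy to measure how many samples avoided the expensive search.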


international conference of the ieee engineering in medicine and biology society | 2004

Protein seer: a Web server for protein homology detection

U. Karaoz; Pedro J. Moreno; Zhiping Weng; Simon Kasif

We present and evaluate a publicly available Web server which classifies protein sequences into SCOP 1.63 PDB95 structural superfamilies. The Website returns ranked lists of likely superfamilies, and hence implicit structural predictions, according to three computational techniques: BLAST, HMMER and a discriminative classifier, SVM-BLOCKS. It is the first Website to provide predictions using SVM-BLOCKS. In addition to the ranked lists, the Website displays alignment information, and a Web services interface is also available for computationally intensive use. We conduct a large-scale evaluation which mimics the predictions returned by the Website. The study indicates that the site provides valid predictions and that the SVM-BLOCKS approach can outperform BLAST and HMMER when sufficient examples are available to learn the SVM classifiers.


neural information processing systems | 2003

A Kullback-Leibler Divergence Based Kernel for SVM Classification in Multimedia Applications

Pedro J. Moreno; Purdy Ho; Nuno Vasconcelos


Archive | 1997

Environmentally compensated speech processing

Brian Eberman; Pedro J. Moreno


Archive | 2000

Computer method and apparatus for segmenting text streams

Pedro J. Moreno; David M. Blei

Collaboration


Dive into Pedro J. Moreno's collaboration.

Top Co-Authors

Matthew T. Moores

Queensland University of Technology
