
Publications


Featured research published by T. V. Raman.


International World Wide Web Conference | 1997

Cascaded speech style sheets

T. V. Raman

Cascading Style Sheets (CSS) enable WWW designers to separate layout from content on a WWW site, and help the site designer customize the look and feel of a site without having to edit all the pages making up the site. Style sheets are often thought of as a means to specify the visual appearance of a WWW page. This paper takes a more general view: CSS style sheets can in fact be used equally well to control the appearance of a WWW site when presented in non-traditional modalities such as speech. This paper outlines the reasoning behind the design of the speech style sheet specification and describes a working implementation that produces high-quality, audio-formatted spoken renderings of well-authored WWW content. The paper reinforces the need to keep WWW site design independent of today's specific browser implementations by demonstrating the ability to specify aural renderings that can, in principle, be completely separate from the visual appearance of a WWW page, given a well-structured collection of HTML documents.
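The ideas in this paper fed into the aural properties later standardized in CSS2. A minimal sketch of what such a speech style sheet looks like, using CSS2 aural property names (the sound file name is a made-up placeholder):

```css
/* Illustrative aural rules in the spirit of the paper's speech style
   sheets; property names follow the CSS2 aural media type. */
h1 {
  voice-family: male;
  speech-rate: slow;
  pause-after: 500ms;   /* silence marks the end of a heading */
}
em {
  pitch: high;          /* emphasis rendered as a pitch change, not italics */
}
a:link {
  cue-before: url("link.wav");  /* hypothetical auditory icon before links */
}
```

Note how these rules never mention fonts or colors: the same HTML can carry a visual style sheet and an aural one side by side.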


Human Factors in Computing Systems | 1996

Emacspeak—a speech interface

T. V. Raman

Screen-readers, computer software that enables a visually impaired user to read the contents of a visual display, have been available for more than a decade. Screen-readers are separate from the user application; consequently, they have little or no contextual information about the contents of the display. The author has used traditional screen-reading applications for the last five years. The speech-enabling approach described here has been implemented in Emacspeak to overcome many of the shortcomings he has encountered with traditional screen-readers. The approach used by Emacspeak is very different from that of traditional screen-readers. Screen-readers allow the user to listen to the contents appearing in different parts of the display, but the user is entirely responsible for building a mental model of the visual display in order to interpret what an application is trying to convey. Emacspeak, on the other hand, does not speak the screen. Instead, applications provide both visual and speech feedback, and the speech feedback is designed to be sufficient by itself. This approach reduces cognitive load on the user and is relevant to providing general spoken access to information. Producing spoken output from within the application, rather than speaking the visually displayed information, vastly improves the quality of the spoken feedback. Thus, an application can display its results in a visually pleasing manner; the speech-enabling component renders the same information in an aurally pleasing way.
A screen-reader is a computer application designed to provide spoken feedback to a visually impaired user. Screen-readers have been available since the mid-80s. During the 80s, such applications relied on the character representation of the contents of the screen to produce the spoken feedback. The advent of bitmap displays led to a complete breakdown of this approach, since the contents of the screen were now light and dark pixels. A significant amount of research and development has been carried out to overcome this problem and provide speech access to the Graphical User Interface (GUI). The best and perhaps the most complete speech access system for the GUI is Screenreader/2 (ScreenReader for OS/2), developed by Dr. Jim Thatcher at the IBM Watson Research …
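The contrast between speaking the screen and speaking the application can be made concrete with a small sketch. This is illustrative only, not Emacspeak's actual (Emacs Lisp) code; the class and method names are invented:

```python
# Illustrative sketch: a speech-enabled application produces its own
# spoken feedback with full knowledge of what the content means, while
# a traditional screen-reader can only speak the rendered display.

def screen_reader(display_text):
    """A screen-reader sees only the characters on screen."""
    return f"speak: {display_text}"

class SpeechEnabledCalendar:
    """The application itself knows the *meaning* of what it displays."""

    def show_date(self, day, month, year):
        visual = f"{day:02d}/{month:02d}/{year}"             # terse visual layout
        spoken = f"{self._month_name(month)} {day}, {year}"  # context-rich speech
        return visual, spoken

    @staticmethod
    def _month_name(month):
        names = ["January", "February", "March", "April", "May", "June",
                 "July", "August", "September", "October", "November",
                 "December"]
        return names[month - 1]

cal = SpeechEnabledCalendar()
visual, spoken = cal.show_date(4, 7, 1996)
print(screen_reader(visual))  # "speak: 04/07/1996" -- is that day/month or month/day?
print(f"speak: {spoken}")     # "speak: July 4, 1996" -- unambiguous
```

The screen-reader output forces the listener to guess the date format from a terse visual encoding; the application-produced speech carries the context directly.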


Conference on Computers and Accessibility | 1996

Emacspeak—direct speech access

T. V. Raman

Emacspeak is a full-fledged speech output interface to Emacs, and is being used to provide direct speech access to a UNIX workstation. The kind of speech access provided by Emacspeak is qualitatively different from what conventional screen-readers provide: Emacspeak makes applications speak, as opposed to speaking the screen. Emacspeak is the first full-fledged speech output system that allows someone who cannot see to work directly on a UNIX system. (Until now, the only option available to visually impaired users has been to use a talking PC as a terminal.) Emacspeak is built on top of Emacs. Once Emacs is started, the user gets complete spoken feedback. I currently use Emacspeak at work on my SUN SparcStation and have also used it on a DEC Alpha workstation under Digital UNIX while at Digital's CRL. I also use Emacspeak as the only speech output system on my laptop running Linux. Emacspeak is available on the Internet: ftp://crl.dec.com/pub/digital/emacspeak/ http://www.research.digital.com/CRL


Conference on Computers and Accessibility | 1994

Interactive audio documents

T. V. Raman; David Gries

Communicating technical material orally is often hindered by the relentless linearity of audio; information flows actively past a passive listener. This is in stark contrast to communication through the printed medium, where we can actively peruse the visual display to access relevant information. AsTeR is an interactive computing system for audio formatting electronic documents (presently, documents written in LaTeX) to produce audio documents. AsTeR can speak both literary texts and highly technical documents that contain complex mathematics. In fact, the effective speaking and interactive browsing of mathematics is a key goal of AsTeR. To this end, a listener can browse both complete documents and complex mathematical expressions. AsTeR thus enables active listening. This paper describes the browsing component of AsTeR. The design and implementation of AsTeR is beyond the scope of this paper. Here, we focus on the browser, referring to other parts of the system in passing for the sake of completeness.


International Journal of Speech Technology | 1995

Audio formatting—Making spoken text and math comprehensible

T. V. Raman; David Gries

AsTeR is an interactive computing system for audio formatting electronic documents (presently, documents written in LaTeX) to produce audio documents. AsTeR can speak both literary texts and highly technical documents that contain complex mathematics. In fact, the effective speaking of mathematics is a key goal of AsTeR. To this end, a listener can request that segments of text or mathematics be spoken using several different rendering styles, in an interactive fashion. Listeners can themselves construct rendering rules and styles, if they feel it necessary. In this paper, we describe the rendering component of AsTeR, the system for writing rules for speaking various parts of text and mathematics, and discuss some of the principles that were used in developing rules for making spoken text, mathematics, and tables comprehensible. Visual communication is characterized by the eye's ability to actively access parts of a two-dimensional display. The reader is active, while the display is passive. This active-passive role is reversed by the temporal nature of oral communication: information flows actively past a passive listener. This prohibits multiple views: it is impossible to first obtain a high-level view and then "look" at details. These shortcomings become severe when presenting complex mathematics orally. Audio formatting, which renders information structure in a manner attuned to an auditory display, overcomes these problems. Audio layout, composed of fleeting and persistent cues, conveys complex structure without detracting from the content. AsTeR is interactive, and the ability to browse information structure and obtain multiple views enables active listening.


Conference on Computers and Accessibility | 1998

Conversational gestures for direct manipulation on the audio desktop

T. V. Raman

We describe the speech-enabling approach to building auditory interfaces that treat speech as a first-class modality. The process of designing effective auditory interfaces is decomposed into identifying the atomic actions that make up the user interaction and the conversational gestures that enable these actions. The auditory interface is then synthesized by mapping these conversational gestures to appropriate primitives in the auditory environment. We illustrate this process with a concrete example by developing an auditory interface to the visually intensive task of playing Tetris. Playing Tetris is a fun activity that has many of the same demands as day-to-day activities on the electronic desktop. Speech-enabling Tetris thus not only provides a fun way to exercise one's geometric reasoning abilities; it also provides useful lessons in speech-enabling commonplace computing tasks.
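The decomposition the abstract describes (atomic actions, then conversational gestures, then auditory primitives) can be sketched as two mapping tables. All the action and gesture names below are invented for illustration; none come from the actual system:

```python
# Hypothetical sketch of the design process the paper describes:
# atomic actions are classified by conversational gesture, and each
# gesture class is realized by a primitive of the auditory environment.

ATOMIC_ACTIONS = {
    # atomic action       -> conversational gesture
    "rotate-piece":         "command",
    "query-board-column":   "question",
    "piece-landed":         "notification",
}

AUDITORY_PRIMITIVES = {
    # conversational gesture -> primitive in the auditory environment
    "command":      lambda name: f"short confirmation tone after {name}",
    "question":     lambda name: f"spoken answer to {name}",
    "notification": lambda name: f"auditory icon for {name}",
}

def render(action):
    """Synthesize the auditory interface for one atomic action."""
    gesture = ATOMIC_ACTIONS[action]
    return AUDITORY_PRIMITIVES[gesture](action)

print(render("rotate-piece"))
print(render("piece-landed"))
```

The point of the indirection is that the interface is designed at the level of gestures, so the same gesture-to-primitive mapping carries over from Tetris to other desktop tasks.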


Human Factors in Computing Systems | 1996

Universal design: everyone has special needs

Eric D. Bergman; Alistair D. N. Edwards; Deborah Kaplan; Greg Lowney; T. V. Raman; Earl Johnson

Despite high-profile discussions of user-centered design in the CHI community, until recently a substantial population of users has been largely ignored. Users who have restricted or no use of hands, eyes, ears, or voice due to environment, task context, repetitive strain injury, or disability constitute a diverse and significant user population, but these users receive relatively little mention in mainstream HCI conferences or literature. Design considerations for users with vision, hearing, or movement impairments overlap with those for the general population across a variety of tasks and contexts (e.g., high-workload tasks, automobile systems, phone interfaces). Following on this theme, the panel will promote discussion of so-called "Universal Design": design for the broadest possible range of users.


Archive | 1997

Concrete Implementation of an Audio Desktop

T. V. Raman

In chapter 3 we introduced the concept of an audio desktop and outlined the basic building blocks that go to make up a fluent auditory interface. The focus was on designing an audio desktop independent of any specific implementation. An audio desktop was characterized in terms of the basic user-level functionality such an environment needs to enable and the tools and techniques that can be used in achieving these goals.


Multimedia Systems | 1995

Audio formatting—presenting structured information aurally

T. V. Raman; David Gries

We have developed a computing system that takes a LaTeX source as input and speaks it. The system is interactive in that the user can browse the document to listen to the parts that most interest him. Special attention has been given to the speaking of mathematical formulas; in this realm, the system outperforms humans. The system is designed primarily for the sight-impaired, but it has much broader potential. AFL, the audio analogue of PostScript (Adobe Systems) for paper output, is smaller than PostScript and consists of a simple block-structured language in which one writes commands that cause words to be spoken and sounds to be played. AFL is used to vary output parameters such as the speed of the spoken word, the pitch of the voice, and the length of pauses. AFL also synchronizes various sound components. The presence of AFL has allowed us to experiment extensively with various ways of speaking mathematics to arrive at effective audio renderings. The design of AFL is the focus of this paper.
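The block structure the abstract mentions means that a rendering rule can change speech parameters locally, with the previous state restored when the block ends, much as PostScript scopes graphics state. A minimal sketch of that idea (the parameter names and class are invented, not AFL's actual syntax):

```python
# Illustrative sketch of block-structured audio state: a block may
# change rendering parameters locally; on exit, the enclosing state
# is restored automatically.
import contextlib

class AudioState:
    def __init__(self):
        self.params = {"speech-rate": 180, "pitch": 1.0, "pause": 0.2}
        self._stack = []

    @contextlib.contextmanager
    def block(self, **changes):
        self._stack.append(dict(self.params))  # save the current state
        self.params.update(changes)            # local assignments
        try:
            yield self
        finally:
            self.params = self._stack.pop()    # restore on block exit

state = AudioState()
with state.block(pitch=1.4, pause=0.5):   # e.g. while rendering a superscript
    print(state.params["pitch"])          # raised pitch inside the block
print(state.params["pitch"])              # outer state is unchanged
```

Scoping parameters this way lets nested mathematical structure (superscripts within fractions within sums) map onto nested audio cues without any rule having to undo another rule's settings.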


Archive | 1997

Nuts and Bolts of Auditory Interfaces

T. V. Raman

This chapter covers various tools and techniques relevant to the design of effective auditory interfaces. It is designed to be a brief overview of currently available technology in this field. The topics introduced here will be used to advantage throughout this book in describing various kinds of auditory interaction. An excellent and frequently updated source of information in this field can be found in the monthly Frequently Asked Questions (FAQ) posting to the Usenet newsgroup comp.speech; a hypertext version can be found on the WWW at URL http://www.speech.cs.cmu.edu/comp.speech/.
