Julián Villegas
University of Aizu
Publications
Featured research published by Julián Villegas.
Journal of Multimedia | 2009
Sabbir Alam; Michael Cohen; Julián Villegas; Ashir Ahmed
In this paper, we propose spatio-temporal silhouette representations, called the silhouette energy image (SEI) and the silhouette history image (SHI), to characterize motion and shape properties for the recognition of human movements such as actions and activities of daily life. The SEI and SHI are constructed from the silhouette image sequence of an action; the SHI additionally encodes the span of the action, i.e., the difference between its end time and start time. To address variability in human shape, we account for anthropometric variation across persons. Features are extracted from geometric shape moments. We tested our approach successfully in indoor and outdoor environments, and our experimental results show that the proposed method of human action recognition is robust, flexible, and efficient.
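A minimal sketch of the kind of silhouette-based features described above. The paper's exact formulation may differ; the `frames` input, the function names, and the use of OpenCV Hu moments are illustrative assumptions.

```python
# Hedged sketch of SEI/SHI-style features (the paper's exact construction may differ).
# Assumes `frames` is a list of binary silhouette images (H x W uint8 arrays, 0/255).
import numpy as np
import cv2

def silhouette_energy_image(frames):
    """Average of the silhouette sequence: captures where the body spends time."""
    stack = np.stack([f.astype(np.float32) / 255.0 for f in frames])
    return stack.mean(axis=0)

def silhouette_history_image(frames):
    """Recency-weighted image over the action span (end time minus start time):
    more recent silhouettes overwrite older ones with larger values."""
    tau = len(frames)                      # duration of the action in frames
    shi = np.zeros_like(frames[0], dtype=np.float32)
    for t, f in enumerate(frames, start=1):
        shi[f > 0] = t / tau               # most recent motion is brightest
    return shi

def shape_moment_features(image):
    """Geometric shape moments (here: Hu moments) as a compact descriptor."""
    m = cv2.moments(image, binaryImage=False)
    return cv2.HuMoments(m).flatten()
```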
Journal of the Acoustical Society of America | 2014
Martin Cooke; Catherine Mayo; Julián Villegas
Speech produced in the presence of noise (Lombard speech) is typically more intelligible than speech produced in quiet (plain speech) when presented at the same signal-to-noise ratio, but the factors responsible for the Lombard intelligibility benefit remain poorly understood. Previous studies have demonstrated a clear effect of spectral differences between the two speech styles and a lack of effect of fundamental frequency differences. The current study investigates a possible role for durational differences alongside spectral changes. Listeners identified keywords in sentences manipulated to possess either durational or spectral characteristics of plain or Lombard speech. Durational modifications were produced using linear or nonlinear time warping, while spectral changes were applied at the global utterance level or to individual time frames. Modifications were made to both plain and Lombard speech. No beneficial effects of durational increases were observed in any condition. Lombard sentences spoken at a speech rate substantially slower than their plain counterparts also failed to reveal a durational benefit. Spectral changes to plain speech resulted in large intelligibility gains, although not to the level of Lombard speech. These outcomes suggest that the durational increases seen in Lombard speech have little or no role in the Lombard intelligibility benefit.
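A small sketch of one way a linear durational manipulation could be realized: uniformly time-warping a plain utterance to the duration of its Lombard counterpart while preserving pitch. This is not the study's actual processing pipeline; the use of librosa and the function below are assumptions for illustration.

```python
# Hedged sketch of a linear durational manipulation (not the study's exact pipeline):
# uniformly time-warp a plain-speech utterance to the duration of its Lombard
# counterpart, leaving pitch and spectral envelope largely intact.
import librosa

def match_duration(plain_path, lombard_path):
    plain, sr = librosa.load(plain_path, sr=None)
    lombard, _ = librosa.load(lombard_path, sr=sr)
    # rate > 1 speeds up, rate < 1 slows down; chosen so durations match
    rate = len(plain) / len(lombard)
    return librosa.effects.time_stretch(plain, rate=rate), sr
```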
virtual reality continuum and its applications in industry | 2010
Julián Villegas; Michael Cohen
hrir~, a new software audio filter for Head-Related Impulse Response (HRIR) convolution, is presented. The filter, implemented as a Pure Data object, allows dynamic modification of a sound source's apparent location by modulating its virtual azimuth, elevation, and range in real time, the last attribute being missing from the similar applications surveyed. With hrir~, users can virtually localize monophonic sources around a listener's head in a region delimited by elevations between [-40, 90]° and ranges between [20, 160] cm from the center of the virtual listener's head. An application based on hrir~ is presented to illustrate its benefits.
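hrir~ itself runs inside Pure Data; the sketch below only illustrates the static core of HRIR-based spatialization (convolving a mono source with one left/right HRIR pair), omitting the dynamic azimuth, elevation, and range control that hrir~ provides. Function and variable names are assumptions.

```python
# Minimal sketch of the core operation behind HRIR-based spatialization:
# convolve a mono source with the left/right head-related impulse responses
# for one fixed direction. (The dynamic interpolation machinery is omitted.)
import numpy as np
from scipy.signal import fftconvolve

def binauralize(mono, hrir_left, hrir_right):
    """Return a 2-channel (N x 2) binaural signal for a fixed virtual direction."""
    left = fftconvolve(mono, hrir_left, mode="full")
    right = fftconvolve(mono, hrir_right, mode="full")
    out = np.stack([left, right], axis=1)
    return out / np.max(np.abs(out))       # simple peak normalization
```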
international conference on computer graphics and interactive techniques | 2013
Michael Cohen; Rasika Ranaweera; Kensuke Nishimura; Yuya Sasamoto; Shun Endo; Tomohiro Oyama; Tetsunobu Ohashi; Yukihiro Nishikawa; Ryo Kanno; Anzu Nakada; Julián Villegas; Yong Ping Chen; Sascha Holesch; Jun Yamadera; Hayato Ito; Yasuhiko Saito; Akira Sasaki
Modern smartphones and tablets have magnetometers that can be used to detect yaw, and this data can be distributed to adjust ambient media. Either static (pointing) or dynamic (twirling) modes can be used to modulate multimodal displays, including 360° imagery and virtual environments. Azimuthal tracking especially allows control of horizontal planar displays, including panoramic and turnoramic image-based rendering, spatial sound, and the position of avatars, virtual cameras, and other objects in virtual environments such as Alice, as well as rhythmic renderings such as musical sequencing.
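A rough sketch of how yaw might be derived from raw magnetometer readings, assuming the device is held roughly level; real implementations typically tilt-compensate with the accelerometer or use the platform's sensor-fusion APIs instead.

```python
# Rough sketch of deriving yaw from a magnetometer, assuming the device is held
# roughly level (production code would tilt-compensate with the accelerometer).
import math

def yaw_degrees(mag_x, mag_y):
    """Heading in [0, 360) degrees from the horizontal magnetic field components."""
    heading = math.degrees(math.atan2(mag_y, mag_x))
    return heading % 360.0

# Static "pointing" mode: map the heading directly to a panoramic view angle.
# Dynamic "twirling" mode: use successive heading differences to drive,
# e.g., a musical sequencer or avatar rotation.
```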
Journal of New Music Research | 2010
Julián Villegas; Michael Cohen
We have created a reintonation system that minimizes the measured roughness of parallel sonorities as they are produced. Intonation adjustments are performed by finding, within a user-defined vicinity, a combination of fundamental frequencies that yields minimal roughness. The vicinity constraint limits pitch drift and eases real-time computation. The algorithm requires no prior knowledge of the tones played or the timbres used. We test a proof-of-concept prototype by adjusting equal temperament intervals rendered with a harmonic spectrum towards pure intervals in real time. This prototype exemplifies musical and auditory characteristics of roughness minimization by adaptive techniques. We discuss the results obtained, limitations, possible improvements, and future work.
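A hedged sketch of the underlying idea: score a sonority with a Sethares-style Plomp–Levelt roughness model and search a small vicinity of detunings for the minimum. Unlike the system described above, which needs no prior knowledge of tones or timbres, the sketch assumes known fundamentals and synthetic harmonic spectra; the parameter values and the brute-force search are illustrative.

```python
# Hedged sketch: estimate sensory roughness with a Sethares-style Plomp-Levelt
# model, then search a small vicinity of detunings (in cents) for the minimum.
# The paper's actual roughness measure and search strategy may differ.
import itertools
import numpy as np

def pair_roughness(f1, a1, f2, a2):
    lo = min(f1, f2)
    s = 0.24 / (0.021 * lo + 19.0)         # critical-bandwidth scaling
    x = abs(f2 - f1)
    return min(a1, a2) * (np.exp(-3.5 * s * x) - np.exp(-5.75 * s * x))

def total_roughness(partials):
    """partials: list of (frequency, amplitude) for all sounding tones."""
    return sum(pair_roughness(f1, a1, f2, a2)
               for (f1, a1), (f2, a2) in itertools.combinations(partials, 2))

def retune(fundamentals, n_partials=6, vicinity_cents=20, step_cents=2):
    """Brute-force search over per-tone detunings within +/- vicinity_cents."""
    offsets = np.arange(-vicinity_cents, vicinity_cents + step_cents, step_cents)
    best, best_r = None, np.inf
    for combo in itertools.product(offsets, repeat=len(fundamentals)):
        detuned = [f * 2 ** (c / 1200) for f, c in zip(fundamentals, combo)]
        partials = [(k * f, 1.0 / k) for f in detuned for k in range(1, n_partials + 1)]
        r = total_roughness(partials)
        if r < best_r:
            best, best_r = detuned, r
    return best
```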
Second Language Research | 2016
Jorge González Alonso; Julián Villegas; María del Pilar García Mayo
This article reports on a study investigating the relative influence of the first language and dominant language (L1) on second language (L2) and third language (L3) morpho-lexical processing. A lexical decision task compared the responses to English NV-er compounds (e.g. taxi driver) and non-compounds provided by a group of native speakers and three groups of learners at various levels of English proficiency: L1 Spanish – L2 English sequential bilinguals and two groups of early Spanish–Basque bilinguals with English as their L3. Crucially, the two trilingual groups differed in their first and dominant language (i.e. L1 Spanish – L2 Basque vs. L1 Basque – L2 Spanish). Our materials exploit an (a)symmetry between these languages: while Basque and English pattern together in the basic structure of (productive) NV-er compounds, Spanish presents a construction that differs in directionality as well as inflection of the verbal element (V[3SG] + N). Results show between- and within-group differences in accuracy and response times that may be ascribable to two factors besides proficiency: the number of languages spoken by a given participant and their dominant language. An examination of response bias reveals an influence of the participants’ first and dominant language on the processing of NV-er compounds. Our data suggest that morphological information in the non-native lexicon may extend beyond morphemic structure and that, similarly to bilingualism, there are costs to sequential multilingualism in lexical retrieval.
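As a generic illustration of how response bias in a lexical decision task can be quantified (not necessarily the analysis used in this study), a signal-detection sketch:

```python
# Generic signal-detection sketch: d' for sensitivity, criterion c for bias.
# The study's own analysis may use a different model; counts are per participant
# or per condition, with "word" trials treated as signal trials.
from scipy.stats import norm

def sdt_measures(hits, misses, false_alarms, correct_rejections):
    # Log-linear correction avoids infinite z-scores at rates of 0 or 1.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    d_prime = norm.ppf(hit_rate) - norm.ppf(fa_rate)
    criterion = -0.5 * (norm.ppf(hit_rate) + norm.ppf(fa_rate))
    return d_prime, criterion   # criterion > 0: conservative bias toward "non-word"
```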
Virtual Reality | 2015
Michael Cohen; Julián Villegas; Woodrow Barfield
The rapid advances in the technology and science of presenting spatial sound in virtual, augmented, and mixed-reality environments seem to be underrepresented in recent literature. The goal of this special issue of the Virtual Reality Journal is twofold: to provide a state-of-the-art review of progress in spatial sound as applied to virtual reality (VR) and augmented reality (AR), and to stimulate further research in this emerging and important field. In this special issue, we are pleased to present papers representing a range of topics, from basic research on the perception of spatial sound to more applied papers on the use of spatial sound in real-world settings. It is the hope of the guest editors that the special issue will also encourage scientists to incorporate spatial sound into their own research and to take the next step beyond the innovative projects described here. Virtual reality, mixed reality, and augmented reality are often technology-driven. On this point, the recent emergence of affordable head-mounted displays (as exemplified by the HTC and Valve Vive, Samsung Gear VR, Sony (PlayStation 4) VR, the Samsung-sponsored eye-tracking FOVE, Google Cardboard, and the Facebook-acquired Oculus Rift) signals a mass diffusion of VR-style applications such as games, programs, and visualization. Presumably, the tracking, alignment, and omnidirectional capabilities often found in these visual displays will foster similar developments in personalized audio displays. Considering the range of information that human senses can detect, multimodal systems that present users with a coordinated display are needed to increase realism and performance. In R&D of virtual and mixed-reality systems, the audio modality has been somewhat underemphasized, partly because the "original field" of virtual reality focused on the dominant (at least for some tasks) visual modality. For instance, despite its ostensible ambitions for multimodal interfaces, the Virtual Reality journal is still classified by its publisher as "Image Processing: Computer Imaging, Graphics, and Vision," whereas Springer's category "HCI: User Interfaces, HCI and Ergonomics" would probably be a better fit. The peer-reviewed articles presented in this special issue represent a broad consideration of themes on spatial sound, including basic research papers, which we include to help provide a baseline of established scientific data in the field, and "application" papers, which are not only interesting but can be used to evaluate how well systems with spatial audio perform in realistic scenarios. These papers will be especially useful for those who design systems for real-world and pragmatic applications. Representing expanding interest in the field, the papers presented in this special issue come from innovative researchers in Asia, Europe, and North America, with a focus on recent advances in spatial "virtual" sound, including spatialized audio interfaces, perception, presence and cognition, navigation and way-finding, and applications of spatialized sound for VR, AR, mixed reality (MR), and presence. A deeper review of many of these topics can be found in a just-published anthology on augmented reality and wearable computers, edited by one of the special issue guest editors (Barfield 2016) and including contributions about spatial sound and augmented audio reality by the other two (Michael Cohen and Julián Villegas).
computer and information technology | 2007
Michael Cohen; I. Jayasingha; Julián Villegas
Using multistandpoint panoramic browsers as displays, we have developed a control function that synchronizes revolution and rotation of a visual perspective around a designated point of regard in a virtual environment. The phase-locked orbit is uniquely determined by the focus and the start point, and the user can parameterize direction, step size, and cycle speed, and invoke an animated or single-stepped gesture. The images can be monoscopic or stereoscopic, and the rendering supports the usual scaling functions (zoom/unzoom). Additionally, via sibling clients that can directionalize real-time audio streams, spatialize HDD-resident audio files, or render rotation via a personal rotary motion platform, spatial sound and proprioceptive sensations can be synchronized with such gestures, providing complementary multimodal displays.
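An illustrative sketch of a phase-locked orbit in the horizontal plane: the viewpoint revolves around the point of regard while its yaw stays locked on that point, with the orbit fixed by the focus and the start point. The names and the 2D simplification are assumptions, not the system's actual implementation.

```python
# Illustrative sketch of a phase-locked orbit: revolution around a designated
# point of regard with rotation (yaw) kept synchronized so the view always
# faces the focus. Parameter names are illustrative.
import math

def orbit_step(focus, start, step_deg, n):
    """Return viewpoint position and yaw after n steps around `focus`.
    focus, start: (x, y) in the horizontal plane; step_deg: signed step size."""
    fx, fy = focus
    sx, sy = start
    radius = math.hypot(sx - fx, sy - fy)          # orbit fixed by focus and start
    phase0 = math.atan2(sy - fy, sx - fx)
    phase = phase0 + math.radians(step_deg) * n
    x = fx + radius * math.cos(phase)
    y = fy + radius * math.sin(phase)
    yaw = math.degrees(math.atan2(fy - y, fx - x)) # always look at the focus
    return (x, y), yaw % 360.0
```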
Journal of the Acoustical Society of America | 2018
Ian Wilson; Julián Villegas
In an analytic-linguistic approach to teaching segmental pronunciation and articulatory setting, teachers and voice coaches give explicit instructions on tongue placement and movements. Instructors assume that learners can do exactly as instructed. This assumption was tested in research by Wilson and Horiguchi (2012, PSLLT), who showed that phonetically untrained participants were very poor at following explicit tongue movement instructions. In their study, both the magnitude and direction of movement of the tongue’s centre of gravity were calculated from 2D ultrasound images. However, by only measuring changes in the centre of gravity, it is possible that movements were found to be smaller than they really were, especially if participants focused on the front of the tongue rather than the whole tongue body. In this study, we reanalyzed the original data, this time focusing on the surface of the tongue, rather than the centre of gravity. We made tongue traces using EdgeTrak software (Li et al., 2005), and compared them using a Smoothing-Spline ANOVA method (Davidson, 2006). Results differed from the original study, showing that participants were more conscious of what the front of the tongue was doing rather than the whole tongue body. Implications for segmental pronunciation teaching will be discussed.
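A toy illustration (not the study's Smoothing-Spline ANOVA analysis) of why centre-of-gravity displacement can understate movement concentrated at the tongue front: compare centroid displacement with point-wise displacement along the traced surface. The array layout is an assumption.

```python
# Toy comparison: centroid displacement vs. maximum point-wise displacement of a
# traced tongue surface. Traces are assumed to be N x 2 arrays of (x, y) points
# sampled at corresponding positions along the tongue.
import numpy as np

def centroid_displacement(trace_a, trace_b):
    """Movement of the tongue's centre of gravity between two traces."""
    return np.linalg.norm(trace_a.mean(axis=0) - trace_b.mean(axis=0))

def max_surface_displacement(trace_a, trace_b):
    """Largest local movement along the surface, e.g., at the tongue front."""
    return np.max(np.linalg.norm(trace_a - trace_b, axis=1))
```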
euro american conference on telematics and information systems | 2016
Julián Villegas
A working prototype for alternative visualizations of environmental data (currently, ionizing radiation) measured with bGeigie nano Safecast sensors is presented. In contrast to previous interfaces, this visualization gives users finer control over the displayed data (e.g., they can determine date ranges, compare locations, and decide the averaging areas) and more detailed information about the resulting visualization (the number of samples per day and per region, etc.). With this new data visualization, it is easier to compare local environmental figures with those of other regions of the planet.
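A hedged sketch of the kind of filtering and aggregation described above, using pandas; the column names are assumptions loosely modeled on Safecast CSV exports rather than a documented schema.

```python
# Hedged sketch: filter measurements by date range and bounding box, then report
# the mean dose rate and sample count per day. Column names (captured_at, value,
# latitude, longitude) are assumptions, not a documented Safecast schema.
import pandas as pd

def summarize(df, start, end, lat_range, lon_range):
    df = df.copy()
    df["captured_at"] = pd.to_datetime(df["captured_at"])
    mask = (
        df["captured_at"].between(start, end)
        & df["latitude"].between(*lat_range)
        & df["longitude"].between(*lon_range)
    )
    daily = df.loc[mask].groupby(df.loc[mask, "captured_at"].dt.date)["value"]
    # Mean value and sample size per day for the selected area and date range.
    return daily.agg(["mean", "count"])
```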