Rigas Kotsakis
Aristotle University of Thessaloniki
Publications
Featured research published by Rigas Kotsakis.
Speech Communication | 2012
Rigas Kotsakis; George Kalliris; Charalampos Dimoulas
The present paper focuses on the investigation of various audio pattern classifiers in broadcast-audio semantic analysis, using radio-programme-adaptive classification strategies with supervised training. Multiple neural network topologies and training configurations are evaluated and compared, in combination with feature-extraction, ranking and feature-selection procedures. Different pattern classification taxonomies are implemented, using programme-adapted multi-class definitions and hierarchical schemes. Hierarchical and hybrid classification taxonomies are deployed in speech analysis tasks, facilitating efficient speaker recognition/identification, speech/music discrimination, and general speech/non-speech detection-segmentation. Exhaustive qualitative and quantitative evaluation is conducted, including indicative comparison with non-neural approaches. Hierarchical approaches exploit classification similarities, easing adaptation to generic radio-broadcast semantic-analysis tasks. The proposed strategy exhibits increased efficiency in radio-programme content segmentation and classification, which is one of the most demanding audio semantics tasks. This strategy can be easily adapted to broader audio detection and classification problems, including demanding real-world speech-communication scenarios.
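The hierarchical taxonomy the abstract describes can be illustrated with a minimal two-stage sketch. A nearest-centroid rule stands in for the paper's neural network topologies, and the two toy features (short-time energy, zero-crossing rate) and all centroid values are illustrative assumptions, not the paper's trained models:

```python
# Hypothetical sketch of a two-stage (hierarchical) audio taxonomy:
# stage 1 separates speech from non-speech, stage 2 refines the speech
# branch into speakers. A nearest-centroid rule stands in for the
# paper's neural networks; centroids below are made-up values.
import math

def features(frame):
    """Two toy descriptors: short-time energy and zero-crossing rate."""
    energy = sum(x * x for x in frame) / len(frame)
    zcr = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) / (len(frame) - 1)
    return (energy, zcr)

def nearest_centroid(feat, centroids):
    """Return the label whose centroid is closest in feature space."""
    return min(centroids, key=lambda lbl: math.dist(feat, centroids[lbl]))

# Stage-1 centroids (speech vs. non-speech) and stage-2 centroids
# (two speakers) -- illustrative values, not trained ones.
STAGE1 = {"speech": (0.1, 0.30), "non-speech": (0.5, 0.05)}
STAGE2 = {"speaker_A": (0.08, 0.25), "speaker_B": (0.15, 0.40)}

def classify(frame):
    feat = features(frame)
    top = nearest_centroid(feat, STAGE1)
    if top == "speech":                      # descend the hierarchy
        return "speech/" + nearest_centroid(feat, STAGE2)
    return top
```

The hierarchical structure means a frame is only handed to the speaker-level classifier once the coarse speech/non-speech decision has been made, which is the adaptation-friendly property the abstract highlights.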
Computers in Human Behavior | 2015
Nikos Antonopoulos; Andreas Veglis; Antonis Gardikiotis; Rigas Kotsakis; George Kalliris
Highlights: a sample of 9150 respondents from media websites in Greece; factors that predict the Web third-person effect (WTPE) were identified; age has statistically significant effects on WTPE. In this study, the characteristics of what users observe when visiting a media website, as well as the predicted impact on oneself, friends and others, are researched. The influence this information has over their opinion verifies the existence of the Web third-person effect (WTPE). Using an online survey (N=9150) across media websites, it was shown that the variables with a greater impact on others or on friends than on oneself are: the number of users concurrently online on the same media website, the exact number of users who have read each article on a media website, and the number of users who have shared a news article on Facebook, Twitter, or other social networks. Moreover, age is a significant factor that explains the findings and is important to the effect. Additionally, factors that affect the influence of user-generated messages on others more than on oneself were found. Furthermore, when the news is perceived as credible and when there is no particular mediated message, the WTPE is absent, confirming existing theory.
international conference on information intelligence systems and applications | 2013
Konstantinos Drossos; Rigas Kotsakis; George Kalliris; Andreas Floros
A variety of recent studies in Audio Emotion Recognition (AER) report high performance and retrieval accuracy. However, in most works music is considered as the original sound content that conveys the identified emotions. One of the music characteristics found to represent a fundamental means for conveying emotions is the set of rhythm-related acoustic cues. Although music is an important aspect of everyday life, there are numerous non-linguistic and non-musical sounds surrounding humans, generally defined as sound events (SEs). Despite this enormous impact of SEs on humans, a scarcity of investigations regarding AER from SEs is observed. There are only a few recent investigations concerned with SEs and AER, presenting a semantic connection between the former and the listener's triggered emotion. In this work we analytically investigate the connection of rhythm-related characteristics of a wide range of common SEs with the arousal of the listener, using sound events with semantic content. To this aim, several feature evaluation and classification tasks are conducted using different ranking and classification algorithms. High accuracy results are obtained, demonstrating a significant relation of SE rhythmic characteristics to the elicited arousal.
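The kind of rhythm-related cue the abstract refers to can be sketched very simply. The example below (an assumption for illustration, not the paper's feature set) extracts one such cue, the onset rate of an amplitude envelope, and maps it to a coarse arousal label with a toy threshold rule:

```python
# Illustrative sketch (not the paper's features or classifiers):
# one rhythm-related cue -- the onset rate of an amplitude envelope --
# mapped to a coarse arousal label via a simple threshold rule.

def onset_rate(envelope, sr, threshold=0.5):
    """Count upward threshold crossings per second in an amplitude envelope."""
    onsets = sum(
        1 for a, b in zip(envelope, envelope[1:]) if a < threshold <= b
    )
    return onsets * sr / len(envelope)

def arousal_label(rate, boundary=2.0):
    """Toy rule: fast, repetitive sound events tend to elicit higher arousal."""
    return "high-arousal" if rate >= boundary else "low-arousal"
```

In the paper itself this mapping is learned by ranking and classification algorithms over many features; the threshold here only illustrates why a rhythmic cue can carry arousal information at all.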
audio mostly conference | 2015
Rigas Kotsakis; A. Mislow; George Kalliris; Maria Matsiola
The current paper focuses on the discrimination of audio content deriving from radio productions, based on the spoken language. During the implementation, several audio features were extracted and subsequently evaluated, capturing the spectral, timbre and tempo properties of the implicated voice signals. In this process, the differentiated patterns that appear in radio productions, such as speech signals, phone conversations and music interferences, had to be initially detected and classified, leading to the employment of a preliminary generic classification scheme. The hierarchical structure of discrimination integrated parametric segmentation with various window lengths, in order to detect the most efficient ones. The conducted experiments were supported by machine learning approaches, and more specifically by artificial neural network topologies, which demonstrate increased discrimination potential when applied to audio semantic-analysis problems. The achieved overall and partial classification performances were high, revealing the saliency of the selected parameters and the efficiency of the whole implemented methodology.
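The parametric segmentation step with various window lengths can be sketched as plain framing code. The candidate window sizes below are illustrative assumptions, not the lengths tuned in the paper:

```python
# Minimal sketch of windowed segmentation: slice a signal into
# fixed-length analysis frames for several candidate window sizes,
# so each size can be evaluated downstream. Sizes are illustrative.

def segment(signal, window_len, hop=None):
    """Split `signal` into frames of `window_len` samples (hop defaults
    to window_len, i.e. non-overlapping); the trailing remainder is dropped."""
    hop = hop or window_len
    return [
        signal[i:i + window_len]
        for i in range(0, len(signal) - window_len + 1, hop)
    ]

def candidate_segmentations(signal, window_lengths=(256, 512, 1024)):
    """One frame list per candidate window length."""
    return {w: segment(signal, w) for w in window_lengths}
```

Each candidate segmentation would then be scored by the downstream classifier, and the window length yielding the best partial and overall performance retained.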
international conference on information intelligence systems and applications | 2014
Rigas Kotsakis; Charalampos Dimoulas; George Kalliris; Andreas Veglis
The present paper focuses on the extraction and evaluation of salient audiovisual features for the prediction of the encoding requirements in multimedia learning content. Decisions over audiovisual encoding are related to the perceived quality of experience (QoE), but also to the physical attributes of the initial material (i.e. resolution, color range, motion activity, audio dynamic range, bandwidth, etc.). Recent research showed that such decisions can be really crucial during the production of audiovisual e-learning material, where poor encoding may lead to unacceptable QoE or even to the creation of a negative emotional response. On the other hand, exaggeratedly high-quality encoding may create increased bandwidth demands that are associated with annoying delays and irregular playback flow, resulting again in QoE degradation with similar emotional repulsion. Thus, a careful treatment with proper encoding balance is required during the production of both networked distance-learning and stand-alone audiovisual mediated resources. Such machine creativity strategies are investigated in the current work with the utilization of applicable audiovisual features, QoE metrics and emotional measures. The current work is part of broader research aiming at implementing intelligent models for optimal audiovisual production and encoding configuration, with respect to the source content attributes, the requested quality of experience (and learning) and the related emotional properties.
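The core idea, predicting an encoding requirement from content attributes, can be shown with a deliberately small example. Everything below is a made-up illustration: the single feature (a motion-activity score), the toy data points and the fitted line are assumptions, whereas the paper works with many audiovisual features, QoE metrics and emotional measures:

```python
# Hedged sketch of the prediction idea: relate one content attribute
# (a motion-activity score) to a target encoding bitrate with ordinary
# least squares. Feature choice and data points are invented for
# illustration only.

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (closed form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return a, my - a * mx

# Toy training set: motion activity in [0, 1] vs. bitrate in kbit/s.
motion = [0.1, 0.3, 0.5, 0.7, 0.9]
bitrate = [400, 800, 1200, 1600, 2000]

a, b = fit_line(motion, bitrate)

def predict_bitrate(motion_score):
    """Suggested encoding bitrate for a clip with this motion activity."""
    return a * motion_score + b
```

A model of this shape makes the trade-off in the abstract concrete: too low a predicted bitrate degrades QoE through poor quality, too high a bitrate degrades it through delays and irregular playback.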
International Journal of Monitoring and Surveillance Technologies Research archive | 2014
Andreas Veglis; George Kalliris; Charalampos Dimoulas; Rigas Kotsakis
The present paper focuses on the extraction and evaluation of salient audiovisual features for the prediction of the encoding requirements in audiovisual content. Recent research showed that encoding decisions can be really crucial during audiovisual mediated communication, where poor encoding may lead to unacceptable Quality of Experience (QoE) or even to the creation of a negative emotional response. In contrast, exaggeratedly high-quality encoding may create increased bandwidth demands that are associated with annoying delays and irregular playback flow, resulting again in QoE degradation with similar emotional repulsion. Thus, a careful treatment with proper encoding balance is required during the production and deployment of mediated-communication audiovisual resources. Such machine-assisted creativity is investigated in the current work, with the utilization of applicable audiovisual features, QoE metrics and emotional measures, aiming at implementing intelligent models for optimal audiovisual production and encoding configuration in demanding mediated communication applications and services.
audio mostly conference | 2016
Nikolaos Vryzas; Rigas Kotsakis; Charalampos Dimoulas; George Kalliris
The current paper investigates a multisensory speaker tracking approach, combining sound localization with visual object detection and tracking. The sound localization module estimates the position of the speaker whenever a new spatial audio event is detected (i.e. sound source position/speaker alteration). Besides localization, spatial audio events can be detected/verified through various decision-making systems utilizing multichannel audio features. Visual object detection and tracking is also considered, either in parallel with the sound system or subsequently, after the initial sound localization. The case scenario examined in this paper consists of energy-based sound localization using a core cross-shaped coincident microphone array, combined with state-of-the-art machine vision, such as the OpenCV face detection pre-trained classifiers and the Open Tracking-Learning-Detection (openTLD) framework. A modular multi-sensory architecture is involved, allowing microphone array(s) to be combined with multi-camera sequences and other signals (i.e. depth/motion imaging). The proposed approach is presented and demonstrated in focused real-world scenarios (i.e. cultural/theatrical shows capturing and live-streaming, meetings and press-conferences, recording and broadcast of video lectures, teleconferences, etc.).
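The energy-based localization step for a cross-shaped array can be sketched as follows. Four coincident directional capsules are assumed to face 0, 90, 180 and 270 degrees, and the azimuth estimate is taken as the angle of the energy-weighted sum of their aiming vectors; the geometry and implicit capsule gains are assumptions for illustration, not the paper's calibrated setup:

```python
# Sketch of energy-based azimuth estimation for a cross-shaped
# coincident array: weight each capsule's aiming vector by its
# short-time energy and take the angle of the resulting sum.
# Geometry and gains are illustrative assumptions.
import math

MIC_ANGLES = [0.0, 90.0, 180.0, 270.0]   # capsule aiming directions (deg)

def channel_energy(frame):
    return sum(x * x for x in frame) / len(frame)

def estimate_azimuth(frames):
    """Angle (deg, in [0, 360)) of the energy-weighted direction vector."""
    vx = vy = 0.0
    for ang, frame in zip(MIC_ANGLES, frames):
        e = channel_energy(frame)
        vx += e * math.cos(math.radians(ang))
        vy += e * math.sin(math.radians(ang))
    return math.degrees(math.atan2(vy, vx)) % 360.0
```

In the full system this coarse audio estimate would cue the visual module (e.g. OpenCV face detection), which then refines and tracks the speaker position.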
Journal of The Audio Engineering Society | 2012
Rigas Kotsakis; George Kalliris; Charalampos Dimoulas
2012 Seventh International Workshop on Semantic and Social Media Adaptation and Personalization | 2012
Rigas Kotsakis; Charalampos Dimoulas; George Kalliris
audio mostly conference | 2015
Efstathios A. Sidiropoulos; Evdokimos I. Konstantinidis; Rigas Kotsakis; Andreas Veglis