Vahid Sedighi
Binghamton University
Publications
Featured research published by Vahid Sedighi.
international workshop on information forensics and security | 2014
Tomás Denemark; Vahid Sedighi; Vojtech Holub; Rémi Cogranne; Jessica J. Fridrich
From the perspective of signal detection theory, it seems obvious that knowing the probabilities with which the individual cover elements are modified during message embedding (the so-called probabilistic selection channel) should improve steganalysis. It is, however, not clear how to incorporate this information into steganalysis features when the detector is built as a classifier. In this paper, we propose a variant of the popular spatial rich model (SRM) that makes use of the selection channel. We demonstrate on three state-of-the-art content-adaptive steganographic schemes that even an imprecise knowledge of the embedding probabilities can substantially increase the detection accuracy in comparison with feature sets that do not consider the selection channel. Overly adaptive embedding schemes seem to be more vulnerable than schemes that spread the embedding changes more evenly throughout the cover.
IEEE Transactions on Information Forensics and Security | 2016
Vahid Sedighi; Rémi Cogranne; Jessica J. Fridrich
Most current steganographic schemes embed the secret payload by minimizing a heuristically defined distortion. Similarly, their security is evaluated empirically using classifiers equipped with rich image models. In this paper, we pursue an alternative approach based on a locally estimated multivariate Gaussian cover image model that is sufficiently simple to derive a closed-form expression for the power of the most powerful detector of content-adaptive least significant bit matching but, at the same time, complex enough to capture the non-stationary character of natural images. We show that when the cover model estimator is properly chosen, state-of-the-art performance can be obtained. The closed-form expression for detectability within the chosen model is used to obtain new fundamental insight regarding the performance limits of empirical steganalysis detectors built as classifiers. In particular, we consider a novel detectability-limited sender and estimate the secure payload of individual images.
electronic imaging | 2015
Vahid Sedighi; Jessica J. Fridrich; Rémi Cogranne
The vast majority of steganographic schemes for digital images stored in the raster format limit the amplitude of embedding changes to the smallest possible value. In this paper, we investigate the possibility of further improving the empirical security by allowing the embedding changes in highly textured areas to have a larger amplitude, and thus embedding a larger payload there. Our approach is entirely model driven in the sense that the probabilities with which the cover pixels should be changed by a certain amount are derived from the cover model to minimize the power of an optimal statistical test. The embedding consists of two steps. First, the sender estimates the cover model parameters -- the pixel variances -- modeling the pixels as a sequence of independent but not identically distributed generalized Gaussian random variables. Then, the embedding change probabilities for changing each pixel by ±1 or ±2, which can be transformed to costs for practical embedding using syndrome-trellis codes, are computed by solving a pair of non-linear algebraic equations. Using rich models and selection-channel-aware features, we compare the security of our scheme based on the generalized Gaussian model with pentary versions of two popular embedding algorithms: HILL and S-UNIWARD.
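The abstract mentions converting embedding change probabilities into costs usable with syndrome-trellis codes. Below is a minimal sketch of the standard conversion for ternary (±1) embedding, rho = ln(1/beta − 2), together with its Gibbs-form inverse; the function names and the clipping constant are illustrative assumptions, and the pentary case with ±2 changes would require a second cost per pixel.

```python
import numpy as np

def probs_to_costs(beta, eps=1e-12):
    """Map per-pixel change probabilities beta (prob. of +1, equal to prob. of -1)
    to additive embedding costs via the standard relation rho = ln(1/beta - 2).
    A sketch of the conversion mentioned in the abstract, not the full derivation."""
    beta = np.clip(beta, eps, 1.0 / 3.0 - eps)   # ternary change rates stay below 1/3
    return np.log(1.0 / beta - 2.0)

def costs_to_probs(rho, lam=1.0):
    """Inverse mapping in Gibbs form with Lagrange multiplier lam;
    lam = 1 recovers the original beta exactly."""
    e = np.exp(-lam * rho)
    return e / (1.0 + 2.0 * e)

# Round trip check on synthetic change rates
beta = np.array([0.01, 0.1, 0.3])
print(costs_to_probs(probs_to_costs(beta)))   # ~ [0.01, 0.1, 0.3]
```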
Proceedings of SPIE | 2014
Jan Kodovský; Vahid Sedighi; Jessica J. Fridrich
When a steganalysis detector trained on one cover source is applied to images from a different source, generally the detection error increases due to the mismatch between both sources. In steganography, this situation is recognized as the so-called cover source mismatch (CSM). The drop in detection accuracy depends on many factors, including the properties of both sources, the detector construction, the feature space used to represent the covers, and the steganographic algorithm. Although well recognized as the single most important factor negatively affecting the performance of steganalyzers in practice, the CSM received surprisingly little attention from researchers. One of the reasons for this is the diversity with which the CSM can manifest. On a series of experiments in the spatial and JPEG domains, we refute some of the common misconceptions that the severity of the CSM is tied to the feature dimensionality or their “fragility.” The CSM impact on detection appears too difficult to predict due to the effect of complex dependencies among the features. We also investigate ways to mitigate the negative effect of the CSM using simple measures, such as by enlarging the diversity of the training set (training on a mixture of sources) and by employing a bank of detectors trained on multiple different sources and testing on a detector trained on the closest source.
international workshop on information forensics and security | 2015
Rémi Cogranne; Vahid Sedighi; Jessica J. Fridrich; Tomáš Pevný
The ensemble classifier, based on Fisher Linear Discriminant (FLD) base learners, was introduced specifically for steganalysis of digital media, which currently relies on high-dimensional feature spaces. At present, it is probably the most widely used method for designing supervised classifiers for steganalysis of digital images because of its good detection accuracy and small computational cost. It has been assumed by the community that the classifier implements a non-linear boundary by pooling the binary decisions of the individual classifiers within the ensemble. This paper challenges this assumption by showing that a linear classifier obtained by various regularizations of the FLD can perform equally well as the ensemble. Moreover, it demonstrates that, using state-of-the-art solvers, linear classifiers can be trained more efficiently and offer potential advantages over the original ensemble, including much lower computational complexity. All claims are supported experimentally on a wide spectrum of stego schemes operating in both the spatial and JPEG domains with a multitude of rich steganalysis feature sets.
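As a rough illustration of the regularized linear alternative discussed in this abstract, the sketch below computes a ridge-regularized FLD in closed form; the regularization weight lam and the midpoint threshold are assumptions chosen for illustration, not the paper's exact regularizations or solvers.

```python
import numpy as np

def regularized_fld(X_cover, X_stego, lam=1e-2):
    """Closed-form Fisher Linear Discriminant with ridge regularization:
    w = (Sw + lam*I)^(-1) (mu_stego - mu_cover).
    Rows of X_cover / X_stego are feature vectors of training images."""
    mu0, mu1 = X_cover.mean(axis=0), X_stego.mean(axis=0)
    Sw = np.cov(X_cover, rowvar=False) + np.cov(X_stego, rowvar=False)
    w = np.linalg.solve(Sw + lam * np.eye(Sw.shape[0]), mu1 - mu0)
    b = -0.5 * w @ (mu0 + mu1)        # threshold at the midpoint of projected means
    return w, b                       # decide 'stego' if w @ x + b > 0

# Example on synthetic 10-D features
rng = np.random.default_rng(0)
w, b = regularized_fld(rng.normal(0, 1, (500, 10)), rng.normal(0.3, 1, (500, 10)))
```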
electronic imaging | 2017
Vahid Sedighi; Jessica J. Fridrich
Feature-based steganalysis has been an integral tool for detecting the presence of steganography in communication channels for a long time. In this paper, we explore the possibility of utilizing the powerful optimization algorithms available in convolutional neural network packages to optimize the design of rich features. To this end, we implemented a new layer that simulates the formation of histograms from truncated and quantized noise residuals computed by convolution. Our goal is to show the potential to compactify and further optimize existing features, such as the projection spatial rich model (PSRM).

Motivation

In steganography, the main goal is to communicate secretly through an overt channel. Steganalysis, on the other hand, tries to detect the presence of steganography. For a long time, detection relied on classifiers trained in a supervised fashion on examples of cover and stego images [3]. To detect content-adaptive schemes, researchers hand-crafted various high-dimensional feature representations (rich models). Such features are typically formed from noise residuals extracted from the input image by convolutions with high-pass linear filters (kernels). The array of residuals is then represented either with co-occurrence matrices (empirical joint densities) [19, 12, 2] or via histograms of projections on random directions [6].

Due to the inherent complexity of digital images, the design of suitable kernels has been based entirely on heuristics. While there were attempts to optimize the kernels by parametrizing them and determining the parameters by minimizing the classifier detection error using, e.g., the Nelder-Mead algorithm [5], the complexity of evaluating the objective function (training a classifier on thousands of images and evaluating its performance using, e.g., the minimal total error probability PE) makes such approaches non-scalable and unable to optimize a sufficient number of kernels to build a rich model. The main flaw is the need for constant feedback from the classifier, which is very time consuming due to the complexity of training with high-dimensional features on thousands of images.

The main advantage of Convolutional Neural Networks (CNNs) is their ability to optimize the feature extraction and classification steps simultaneously and thus close the loop between feature extraction and classification. Such networks are not only capable of learning the best decision boundary between different classes but also the best representation of each data class, which improves their separability. Originally developed for numerous computer vision problems, these networks were recently adapted for steganalysis [16, 14, 13, 17, 18]. While in a typical computer vision problem CNNs are used to learn patterns or objects, in steganalysis the signal of interest is hidden within the noise component of the image. To deal with this issue, in one of the early works in this direction, Qian et al. [14] proposed to adjust the network design by using one of the successful hand-designed high-pass kernels from the Spatial Rich Model (SRM) [3] as the first (fixed) convolutional layer in the network. The main reason behind this step is to suppress most of the image content and thus force the network to "pay attention" to high-frequency details. A closer look at the CNN structure reveals a similarity with feature-based steganalysis: the convolutional layers play the role of residual extraction, while activation and pooling layers mimic truncation and quantization.
Various researchers have tried to improve the performance of CNNs for steganalysis using insights acquired from feature-based steganalysis. In one of the most recent works in this direction, Xu et al. [17] proposed a novel CNN architecture capable of approaching the performance of feature-based steganalysis. In their five-layer CNN, after initial high-pass filtering and the first convolutional layer, they proposed to take the absolute value (ABS) of the feature maps (residuals) as the activation function, followed by a hyperbolic tangent (TanH) in order to preserve the sign symmetry of the residuals, in a process similar to computing SRM residuals. Additionally, by using 1×1 convolutional kernels in the last three layers, the authors forced the network to collect the local statistics from the feature maps in a pixel-by-pixel fashion.

Inspired by the similarities between conventional feature-based and CNN detectors, in this paper we investigate the possibility of using the existing infrastructure behind CNNs to optimize the design of kernels (linear pixel predictors) in feature-based steganalysis. By replacing the objective function in the form of a scalar classifier performance criterion with the loss function of a CNN, powerful gradient descent algorithms can be used for the optimization task, provided there is a way to form histograms within a CNN. Thus, as our first step, we implemented a histogram layer for the Caffe CNN package. We use mean-shifted Gaussian functions as the building blocks of this layer to obtain a proper back-flow of gradients through the layer and to facilitate learning of the parameters behind this layer.

As a proof of concept, in this paper we use this layer to model individual submodels of the PSRM [6]. Instead of forming co-occurrences of neighboring quantized residual samples, the PSRM projects unquantized residual values on random directions, which are subsequently quantized and represented using histograms as steganalysis features. In the PSRM, each SRM kernel is projected on 55 random two-dimensional kernels and their rotated and mirrored versions. The higher detection rate of this feature set is the result of a large number of projections, which comes at a high computational cost. This problem can render this powerful feature set unusable for applications with limited time and computational power [10]. Modeling these submodels within the CNN framework with the histogram layer enables us to reduce the high dimensionality of this feature set by replacing the random kernels with fewer optimized kernels. Our study also hints at the possibility of extracting more information in the final layers of CNNs to pave the way for better network designs.

In the next section, we introduce the histogram layer and discuss its design and internal building blocks. In Section "Experimental setup," we simulate PSRM models within the CNN framework using the histogram layer and outline the training procedure. The proposed histogram layer is tested in Section "Analyzing results," where we compare the detection results of our CNN model with PSRM submodels and discuss the results. The paper is summarized in the last section.

Histogram layer

To capture the local statistics of feature maps using histograms without any information loss, we need to use step functions centered on each histogram bin. While shifted step functions are the best choice to compute independent histogram bins, they are unusable within a CNN framework because their derivative is zero everywhere except at the edges.
This prevents the back-flow of gradients through the layer, and the back-propagation algorithm stops working. In contrast, a Gaussian kernel is a good candidate to form histogram bins. Unlike the sharp edges of a step function, its smooth slopes create a path for gradients to flow backwards through each histogram bin towards previous layers. A Gaussian activation function has the form

g(x) = e^{−(x−μ)^2/σ^2},   (1)

where μ is the center of the histogram bin and σ controls the trade-off between the accuracy of each binning operation (and the overlap between adjacent bins) and the flow of gradients through the layer. In our experiments, we fixed σ = 0.6 to obtain a close match between the exact histogram and a histogram computed using a Gaussian activation function. Figure 1 shows the structure of an 8-bin histogram function using Gaussian activations. The value of μ for each kernel is chosen from the set μ ∈ {−3.5, −2.5, ..., 2.5, 3.5}. The tails of the Gaussian kernels on the sides are replaced with a constant value of 1 in order to simulate the residual truncation done in feature-based steganalysis. To compute the value of the histogram bin B^{(k)} centered at μ = μ_k for an M × N activation map ...

Figure 1. Structure of an 8-bin histogram using mean-shifted Gaussian activation functions.
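To make the binning concrete, here is a minimal NumPy sketch of the soft histogram described above, using mean-shifted Gaussian activations with σ = 0.6 and bin centers {−3.5, ..., 3.5}; the saturation of the two outer bins and the normalization by the map size are assumptions about details not fully spelled out in the excerpt.

```python
import numpy as np

def soft_histogram(residual_map, centers=np.arange(-3.5, 4.0, 1.0), sigma=0.6):
    """Soft 8-bin histogram of a residual/activation map built from
    mean-shifted Gaussian kernels g(x) = exp(-(x - mu)^2 / sigma^2).
    The two outer bins saturate at 1 beyond their centers to emulate
    the residual truncation used in feature-based steganalysis."""
    x = residual_map.ravel()[None, :]            # shape (1, M*N)
    mu = centers[:, None]                        # shape (B, 1)
    g = np.exp(-((x - mu) ** 2) / sigma ** 2)    # per-bin Gaussian response
    g[0, x[0] < centers[0]] = 1.0                # left tail -> constant 1
    g[-1, x[0] > centers[-1]] = 1.0              # right tail -> constant 1
    return g.mean(axis=1)                        # one normalized value per bin

# Example: histogram of a synthetic residual map
rng = np.random.default_rng(0)
print(soft_histogram(rng.normal(scale=1.5, size=(16, 16))))
```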
electronic imaging | 2016
Vahid Sedighi; Jessica J. Fridrich; Rémi Cogranne
Steganographic schemes for digital images are routinely designed and benchmarked based on feedback obtained on the standard image set called BOSSbase 1.01. While standardized image sets are important for advancing the field, relying on results from a single source may not provide fair benchmarking and may even lead to designs that are overoptimized and highly suboptimal on other image sources. In this paper, we investigate four modern steganographic schemes for the spatial domain, WOW, S-UNIWARD, HILL, and MiPOD, on two more versions of BOSSbase. We observed that with their default settings, the mutual ranking and detectability of all four embedding algorithms can dramatically change across the three image sources. For example, in a version of BOSSbase whose images were cropped instead of resized, all four schemes exhibit almost the same empirical security when steganalyzed with the spatial rich model (SRM). On the other hand, in decompressed JPEG images, WOW is the most secure embedding algorithm out of the four, and this stays true irrespective of the JPEG quality factor when steganalyzing with both SRM and maxSRM. The empirical security of all four schemes can be increased by optimizing the parameters for each source. This is especially true for decompressed JPEGs. However, the ranking of stego schemes still varies depending on the source. Through this work, we strive to make the community aware of the fact that the empirical security of steganographic algorithms is not absolute but needs to be considered within a given environment, which includes the cover source.

Motivation

Currently, steganographic schemes are often developed and benchmarked on standard image sources. By far the most frequently used database is BOSSbase 1.01 [1], which contains 10,000 images taken in the RAW format by seven different cameras, converted to grayscale, downsampled using the Lanczos resampling algorithm with antialiasing turned off, and cropped to the final size of 512×512 pixels. Many articles have been published in which this database was the sole source on which steganographers fine-tuned their embedding scheme to obtain the best possible empirical security. However, BOSSbase images are far from what many would consider natural -- they are essentially grayscale thumbnails obtained by a script that only a handful of people use. Because of the rather aggressive downsizing of the original full-resolution RAW files, the content of many BOSSbase images is very complex, with apparently rather weak dependencies among neighboring pixel values. The downsizing also effectively suppresses color interpolation artifacts and introduces artifacts of its own. There are images in BOSSbase that are very smooth, e.g., improperly focused images, as well as images that are very dark and contain almost no content, such as an image of the Moon. One may thus argue that BOSSbase contains "enough" diversity to be used as a standardized source. On the other hand, virtually all steganographic schemes contain free parameters or design elements, such as an image transform and filter kernels, that are selected based on feedback provided by detectors on BOSSbase. We show that this makes the design overoptimized to a given image source and the embedding suboptimal on different sources. Even after optimizing the parameters of each embedding scheme to the source, universal benchmarking still does not seem possible since the optimized schemes exhibit different empirical security across sources.
Additionally, the recently proposed synchronization of embedding changes [4, 12] appears far less effective on images with suppressed noise.

In the next section, we explain the measure of empirical security used in this paper and how it is evaluated. We also describe the three versions of BOSSbase that will be investigated, the steganographic algorithms and steganalysis feature sets, as well as the choice of the classifier. In the third section, we start by comparing the empirical security of all algorithms on all three image sources and with two different steganalysis feature sets. Then, in the fourth section, we identify the key parameters of each embedding scheme and perform a grid search to find the setting that maximizes the empirical security. The fifth section is devoted to investigating the impact of synchronizing the selection channel in different sources. The paper is concluded in the last section, where we summarize the most important lessons learned.

Setup of experiments

Security of embedding algorithms will be evaluated experimentally by training a binary classifier on a class of cover images and a class of stego images embedded with a fixed relative payload in bits per pixel (bpp), the so-called payload-limited sender. The classifier is the FLD ensemble [10] with two feature representations -- the Spatial Rich Model (SRM) [7] and its selection-channel-aware version maxSRMd2 [5]. The security is reported with P_E, the minimal total error probability under equal priors,

P_E = 1/2 (P_FA + P_MD),   (1)

obtained on the testing set and averaged over ten 50/50 splits of the image source into training and testing sets (a sketch of computing P_E from classifier scores appears below). Other measures have been proposed in the past, such as the false-alarm rate at 50% correct detection of stego images [13], FA50, which is more telling about the algorithm's security at low false-alarm rates. It has been observed that for the payload-limited sender, the detection statistic that is thresholded in the linearized version of the ensemble classifier [3] when rich models are applied is approximately Gaussian. In this case, both quantities, P_E and FA50, provide the same ranking of stego systems because there is a strictly monotone relationship between them.

For the purpose of this paper, we created the following two new versions of BOSSbase 1.01:

1. BOSSbaseC (C as in Cropped) was obtained using the same script as BOSSbase 1.01 but with the resizing step skipped. The images were centrally cropped to 512×512 pixels right after they were converted from the RAW format to grayscale. Images from this source are less textured but do contain acquisition noise.

2. BOSSbaseJQF (J as in JPEG, QF is the JPEG quality factor) was formed from BOSSbase 1.01 images by JPEG compressing them with quality factor QF ∈ {75, 85, 95}, decompressing them to the spatial domain, and representing the resulting image as 8-bit grayscale. The low-pass character of JPEG compression makes the images less textured and much less noisy.

Figure 1 shows examples of four images from each source. Notice that images from BOSSbaseC appear "zoomed-in" because of the absence of downsizing.

Four embedding algorithms will be investigated in this paper: Wavelet Obtained Weights (WOW) [8], the spatial version of the UNIversal WAvelet Relative Distortion (S-UNIWARD) [9], High-Low-Low (HILL) [11], and Minimizing the power of the most POwerful Detector (MiPOD) [14], which coincides with MultiVariate Gaussian (MVG) steganography with a Gaussian residual model [15].
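As a concrete reading of Eq. (1), the following sketch estimates P_E for one train/test split by scanning a decision threshold over the classifier's outputs; the convention that larger scores mean "stego" and the exhaustive threshold scan are assumptions chosen for illustration, and the number reported in the paper is further averaged over ten splits.

```python
import numpy as np

def p_e(cover_scores, stego_scores):
    """Minimal total error probability under equal priors,
    P_E = min over thresholds of 0.5 * (P_FA + P_MD),
    assuming larger scores indicate the stego class."""
    thresholds = np.concatenate([cover_scores, stego_scores])
    p_fa = np.array([(cover_scores >= t).mean() for t in thresholds])  # false alarms
    p_md = np.array([(stego_scores < t).mean() for t in thresholds])   # missed detections
    return 0.5 * (p_fa + p_md).min()

# Example with synthetic, roughly Gaussian detector outputs
rng = np.random.default_rng(1)
print(p_e(rng.normal(0.0, 1.0, 5000), rng.normal(1.0, 1.0, 5000)))
```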
The study is limited to the spatial domain and does not consider JPEG images because the source generally does not play a significant role in JPEG steganography; the low-pass character of JPEG compression tends to even out the differences between various sources.

Empirical security across sources

The purpose of the first experiment is to show that the ranking of steganographic schemes as originally described in the corresponding papers heavily depends on the image source. Figure 2 shows P_E as a function of the relative payload in bits per pixel (bpp) for the four embedding algorithms listed in the previous section on BOSSbase 1.01 (first row), BOSSbaseC (second row), and BOSSbaseJ85 (third row) with SRM (left) and maxSRMd2 (right). Note that the ranking as well as the differences between individual embedding algorithms vary heavily depending on the cover source. Most notably, in BOSSbaseJ85 the most secure algorithm is WOW while MiPOD is the least secure, the exact opposite of their ranking on BOSSbase 1.01. Moreover, when detecting with the SRM, all four embedding schemes have nearly identical empirical security on BOSSbaseC.

Optimizing steganography for each source

In this section, we investigate how much the empirical security of each algorithm can be improved by adjusting the embedding parameters. This gain is quantified, and the optimized embedding algorithms are ranked again for each image source. We start by describing the parameters with respect to which each embedding scheme was optimized. The description is kept short but, hopefully, detailed enough for a reader familiar with the embedding algorithms to understand the parameters' role. The reader is referred to the corresponding publications for more details.

WOW: This embedding algorithm was designed to prefer making embedding changes at pixels in textured areas, defined as regions with an "edge" in the horizontal, vertical, and both diagonal directions. The embedding begins with extracting directional residuals using tensor products of 8-tap Daubechies filters. Three directional filters with 8×8 kernels, denoted K^(h), K^(v), and K^(d), are used to extract three directional residuals: R^(h) = K^(h) ⋆ X, R^(v) = K^(v) ⋆ X, and R^(d) = K^(d) ⋆ X, where '⋆' denotes convolution and X is the matrix of pixel grayscales. In the next step, the so-called embedding suitabilities are computed: ξ^(k) = |R^(k)| ⋆ |K^(k)|, k ∈ {h, v, d}. The embedding cost of changing pixel i, j by +1 or −1 is obtained using the reciprocal Hölder norm, ρ_ij = (|ξ^(h)_ij|^p + |ξ^(v)_ij|^p + |ξ^(d)_ij|^p)^(−1/p) with p = −1 (a code sketch of this cost computation appears at the end of this entry). To optimize WOW for different image sources, we search over the number of taps in the Daubechies filters, p1 ∈ {2, 4, 8, 16}, and the power of the Hölder norm, p2 = p.

S-UNIWARD: The pixel embedding costs are obtained from a distortion function defined as the sum of relative absolute differences between wavelet coefficients of the cover and stego images. Only the highest-frequency band of wavelet coefficients is used ...
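Following the WOW description above, here is a minimal sketch of that cost construction; the orientation convention of the tensor-product kernels, the 'same'/symmetric boundary handling, and the stabilizing eps are assumptions, and the 1-D Daubechies low-pass/high-pass decomposition filters (lo, hi) must be supplied, e.g., from a wavelet library.

```python
import numpy as np
from scipy.signal import convolve2d

def wow_costs(X, lo, hi, p=-1.0, eps=1e-10):
    """Sketch of WOW-style embedding costs.
    lo, hi: 1-D Daubechies low/high-pass decomposition filters (e.g., 8 taps).
    Builds directional kernels as tensor products, extracts residuals,
    computes suitabilities, and aggregates them with the reciprocal
    Hoelder norm (p = -1 by default)."""
    X = np.asarray(X, dtype=np.float64)
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    kernels = [np.outer(lo, hi),   # K^(h): horizontal detail (assumed orientation)
               np.outer(hi, lo),   # K^(v): vertical detail
               np.outer(hi, hi)]   # K^(d): diagonal detail
    suitabilities = []
    for K in kernels:
        R = convolve2d(X, K, mode='same', boundary='symm')       # residual R^(k)
        xi = convolve2d(np.abs(R), np.abs(K), mode='same',
                        boundary='symm')                          # suitability xi^(k)
        suitabilities.append(xi)
    xi = np.stack(suitabilities)
    # reciprocal Hoelder norm; eps keeps flat regions from dividing by zero
    return np.sum((xi + eps) ** p, axis=0) ** (-1.0 / p)
```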
international conference on image processing | 2016
Vahid Sedighi; Jessica J. Fridrich
When hiding messages in digital images, care needs to be exercised in how the embedding changes are executed in or near saturated pixels. In this paper, we consider three different rules currently in use that adjust the embedding in saturated pixels and assess their impact on the empirical steganographic security of four modern embedding algorithms. Surprisingly, the rules can have a major effect, especially in image sources with stronger noise. We show that the preferred way to treat saturated patches during message hiding is to adjust the pixel costs to entirely avoid making embedding changes in saturated pixels, despite the ensuing loss of embedding capacity. This paper hopes to raise awareness of the importance of the treatment of saturated pixels in steganography to avoid introducing easily correctable flaws that may negatively affect security.
international conference on acoustics, speech, and signal processing | 2017
Rémi Cogranne; Vahid Sedighi; Jessica J. Fridrich
This paper investigates practical strategies for distributing payload across images with content-adaptive steganography and for pooling the outputs of a single-image detector for steganalysis. Adopting a statistical model for the detector's output, the steganographer minimizes the power of the most powerful detector of an omniscient Warden, while the Warden, informed of the payload-spreading strategy, detects with the likelihood ratio test in the form of a matched filter. Experimental results with state-of-the-art content-adaptive additive embedding schemes and rich models are included to show the relevance of the results.
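To illustrate the pooled detection idea in this abstract, the sketch below implements a generic matched-filter pooler under a simple Gaussian model of the single-image detector outputs: each output is assumed Gaussian with a known shift mu_i under embedding and common variance, so the likelihood ratio reduces to a weighted sum. The model, the variable names, and the unit-variance normalization are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def matched_filter_pool(outputs, shifts, sigma=1.0):
    """Pool single-image detector outputs s_i assumed to follow
    N(0, sigma^2) on cover images and N(mu_i, sigma^2) on stego images.
    The log-likelihood ratio is then proportional to sum_i mu_i * s_i,
    normalized here so the statistic is N(0, 1) under the cover hypothesis."""
    outputs = np.asarray(outputs, float)
    shifts = np.asarray(shifts, float)
    return shifts @ outputs / (sigma * np.linalg.norm(shifts))
    # compare against a threshold set by the desired false-alarm rate

# Example: 100 images, payload concentrated in the first half
rng = np.random.default_rng(2)
mu = np.concatenate([np.full(50, 0.4), np.zeros(50)])
print(matched_filter_pool(rng.normal(mu, 1.0), mu))
```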
information hiding | 2015
Vahid Sedighi; Jessica J. Fridrich
It has recently been shown that steganalysis of content-adaptive steganography can be improved when the Warden incorporates in her detector the knowledge of the selection channel -- the probabilities with which the individual cover elements were modified during embedding. Such attacks implicitly assume that the Warden knows at least approximately the payload size. In this paper, we study the loss of detection accuracy when the Warden uses a selection channel that was imprecisely determined either due to lack of information or due to the stego changes themselves. The loss is investigated for two types of qualitatively different detectors -- binary classifiers equipped with selection-channel-aware rich models and optimal detectors derived using the theory of hypothesis testing from a cover model. Two different embedding paradigms are addressed -- steganography based on minimizing distortion and embedding that minimizes the detectability of an optimal detector within a chosen cover model. Remarkably, the experimental and theoretical evidence are qualitatively in agreement across different embedding methods, and both point out that inaccuracies in the selection channel do not have a strong effect on steganalysis detection errors: it pays off to use an imprecise selection channel rather than none. Our findings validate the use of selection-channel-aware detectors in practice.