A short letter on the dot product between rotated Fourier transforms
Aaron R. Voelker
Applied Brain Research Inc.
Technical Report
July 28, 2020
Abstract
Spatial Semantic Pointers (SSPs) have recently emerged as a powerful tool for representing and transforming continuous space, with numerous applications to cognitive modelling and deep learning. Fundamental to SSPs is the notion of "similarity" between vectors representing different points in n-dimensional space – typically the dot product or cosine similarity between vectors with rotated unit-length complex coefficients in the Fourier domain. The similarity measure has previously been conjectured to be a Gaussian function of Euclidean distance. Contrary to this conjecture, we derive a simple trigonometric formula relating spatial displacement to similarity, and prove that, in the case where the Fourier coefficients are uniform i.i.d., the expected similarity is a product of normalized sinc functions: \( \prod_{k=1}^{n} \operatorname{sinc}(a_k) \), where a ∈ R^n is the spatial displacement between the two n-dimensional points. This establishes a direct link between space and the similarity of SSPs, which in turn helps bolster a useful mathematical framework for architecting neural networks that manipulate spatial structures.

Scalar Analysis
Let F{·} denote the discrete Fourier transform, and let X ∈ R^d be a vector such that all of the complex coefficients in F{X} are unit-length – also known as a "unitary" Semantic Pointer (SP; Plate, 1995; Gosmann, 2018). Such vectors are fully determined by their polar angles in the Fourier domain, θ ∈ R^d, i.e., the parameters:

\[ \theta = \operatorname{Imag}\left[ \ln \mathcal{F}\{X\} \right], \qquad |\theta| < \pi. \tag{1} \]

Given any x ∈ R, we then use the following definition to encode x into a high-dimensional vector:

\[ X^x \overset{\text{DEF}}{=} \mathcal{F}^{-1}\left\{ e^{i\theta x} \right\}, \tag{2} \]

which combines the Spatial Semantic Pointer (SSP) "fractional binding" definition from Komer et al. (2019) with Euler's formula. Essentially, X^x encodes a real-valued scalar quantity (x) as a high-dimensional unit-length vector that may be convolved with other vectors in semantically meaningful ways, thus enabling the manipulation of topological structures within neural networks (Komer and Eliasmith, 2020; Dumont and Eliasmith, 2020).

Now, consider two scalar SSPs displaced by a ∈ R, as in:

\[ A = X^x, \qquad B = X^{x+a}. \tag{3} \]

Our goal is to characterize A^T B, i.e., the dot product between A and B. Since both vectors are unitary, and the Fourier transform is unitary (i.e., preserves the dot product, up to a constant rescaling by d) and Hermitian, we can assert the following string of equalities (the imaginary components cancel since X is real, by the Hermitian symmetry of the Fourier transform):

\[ d\, A^\top B = \mathcal{F}\{A\}^\top \overline{\mathcal{F}\{B\}} = \sum_{j=1}^{d} e^{i\theta_j x - i\theta_j (x+a)} = \sum_{j=1}^{d} \operatorname{Real}\left[ e^{i\theta_j a} \right] = \sum_{j=1}^{d} \cos(\theta_j a). \tag{4} \]

Figure 1: Plot of sinc(a) = sin(πa)/(πa), relating the displacement (x-axis) to the expected similarity (y-axis) between two unitary SPs. The similarity is symmetric about a = 0.

Thus, we obtain the following trigonometric formula relating the cosine similarity to the displacement a, in terms of the polar angles of X:

\[ A^\top B = \frac{1}{d} \sum_{j=1}^{d} \cos(\theta_j a). \tag{5} \]

That is, the similarity is equal to the real-valued mean across the complex numbers that are determined by scaling each polar angle (θ) by the displacement (a). To turn this formula into something more concrete, we must assume something about θ.
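As a numerical sanity check of equation (5), the following sketch (assuming NumPy; the dimensionality, seed, and values of x and a are arbitrary illustrative choices, not from this letter) constructs a random unitary SP by sampling polar angles with the conjugate symmetry required for X to be real, encodes two displaced scalars via fractional binding, and confirms that their dot product equals the mean of cos(θ_j a):

```python
import numpy as np

# Sanity check of equation (5). All choices below (d, seed, x, a) are
# arbitrary illustrative values.
rng = np.random.default_rng(seed=0)
d = 64

# Sample polar angles with conjugate symmetry, theta_{d-j} = -theta_j (and
# theta_0 = theta_{d/2} = 0), so that X = F^{-1}{e^{i*theta}} is real-valued.
theta = np.zeros(d)
theta[1:d // 2] = rng.uniform(-np.pi, np.pi, size=d // 2 - 1)
theta[d // 2 + 1:] = -theta[1:d // 2][::-1]

def encode(x):
    """Fractional binding (equation (2)): X^x = F^{-1}{e^{i*theta*x}}."""
    return np.fft.ifft(np.exp(1j * theta * x)).real

x, a = 1.7, 0.6
A, B = encode(x), encode(x + a)

# Equation (5): A^T B = (1/d) * sum_j cos(theta_j * a).
assert abs(A @ B - np.mean(np.cos(theta * a))) < 1e-10
# Both SSPs are unit-length, since their Fourier coefficients are unit-length.
assert abs(np.linalg.norm(A) - 1.0) < 1e-10
```

The conjugate-symmetric sampling is just one way to satisfy equation (1) while keeping X real; any θ meeting that constraint would exhibit the same identity.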
A very natural assumption is that θ_j ∼ U(−π, π) are independent and identically distributed (i.i.d.), although we note this is not the case for SSP encodings that use hexagonal lattices or other regular grids (Dumont and Eliasmith, 2020; Komer and Eliasmith, 2020). Focusing on the uniform case, we apply the law of the unconscious statistician to derive the expected similarity:

\[ \mathbb{E}_{\theta}\left[ A^\top B \right] = \frac{1}{d} \sum_{j=1}^{d} \frac{1}{2\pi} \int_{-\pi}^{\pi} \cos(\theta a)\, d\theta = \frac{1}{2\pi} \int_{-\pi}^{\pi} \cos(\theta a)\, d\theta = \frac{\sin(\pi a)}{\pi a} \overset{\text{DEF}}{=} \operatorname{sinc}(a). \]

Here, sinc(·) is defined to be the normalized sinc function – plotted in Figure 1 for reference.

Higher-Dimensional Spaces
To generalize this to SSPs representing n-dimensional space (e.g., n = 2 in Komer et al. (2019)), we repeat the above recipe, where instead X ∈ R^{n×d} and Θ ∈ R^{n×d} are matrices, such that equations 1 and 2 hold for each row of X and Θ. Now, with x, a ∈ R^n, equation 3 becomes:

\[ A = \circledast_{k=1}^{n} X_k^{\,x_k}, \qquad B = \circledast_{k=1}^{n} X_k^{\,x_k + a_k}, \tag{6} \]

where ⊛ denotes circular convolution and X_k is the k-th row of X. Redoing equations 4 and 5 yields:

\[ A^\top B = \frac{1}{d} \sum_{j=1}^{d} \operatorname{Real}\left[ \exp\left\{ i \sum_{k=1}^{n} \Theta_{k,j} a_k \right\} \right] = \frac{1}{d} \sum_{j=1}^{d} \cos\left( \sum_{k=1}^{n} \Theta_{k,j} a_k \right). \tag{7} \]

(The imaginary components cancel out since X is real, by the Hermitian symmetry of the Fourier transform.)

Finally, for i.i.d. uniform Θ_{k,j}, we obtain the following concrete equation for the expected similarity:

\[ \mathbb{E}_{\Theta}\left[ A^\top B \right] = \frac{1}{(2\pi)^n} \underbrace{\int_{-\pi}^{\pi} \cdots \int_{-\pi}^{\pi}}_{n\ \text{integrals}} \cos\left( \sum_{k=1}^{n} \theta_k a_k \right) d\theta_1 \cdots d\theta_n = \prod_{k=1}^{n} \frac{\sin(\pi a_k)}{\pi a_k} = \prod_{k=1}^{n} \operatorname{sinc}(a_k). \]
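As in the scalar case, both equation (7) and the product-of-sincs expectation can be sanity-checked numerically. In this sketch (assuming NumPy; n, d, the seed, and the test point and displacement are arbitrary illustrative choices), binding is performed by summing the per-axis phases in the Fourier domain, which is equivalent to circularly convolving the per-axis encodings by the convolution theorem; the expectation is then Monte-Carlo-estimated with raw i.i.d. uniform angles, matching the theorem's assumption:

```python
import numpy as np

rng = np.random.default_rng(seed=2)
n, d = 2, 64

# Polar angles for n unitary SPs, with conjugate symmetry per row so that
# each X_k (and hence any bound SSP) is real-valued.
Theta = np.zeros((n, d))
Theta[:, 1:d // 2] = rng.uniform(-np.pi, np.pi, size=(n, d // 2 - 1))
Theta[:, d // 2 + 1:] = -Theta[:, 1:d // 2][:, ::-1]

def encode(point):
    # Binding by circular convolution multiplies Fourier coefficients, so
    # the bound SSP's Fourier phases are sum_k Theta[k, j] * point[k].
    return np.fft.ifft(np.exp(1j * (Theta.T @ point))).real

x = np.array([0.3, -1.2])
a = np.array([0.8, 0.5])
A, B = encode(x), encode(x + a)

# Equation (7): A^T B = (1/d) * sum_j cos(sum_k Theta[k, j] * a_k).
assert abs(A @ B - np.mean(np.cos(Theta.T @ a))) < 1e-10

# Expected similarity under i.i.d. uniform angles: prod_k sinc(a_k).
# (np.sinc is the normalized sinc function, sin(pi*t) / (pi*t).)
samples = rng.uniform(-np.pi, np.pi, size=(200_000, n))
estimate = np.mean(np.cos(samples @ a))
assert abs(estimate - np.prod(np.sinc(a))) < 0.01
```

The Monte-Carlo tolerance is loose only because of sampling noise; the identity in equation (7) holds to machine precision.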
Figure 2: Empirical versus analytical similarity (n = 2, d = 1024). Two unitary vectors (X) are randomly generated with uniformly distributed polar angles (Θ), and the similarity is evaluated across a square grid of displacements a. For each displacement, we plot the Euclidean distance (‖a‖) against the actual similarity (A^T B) in blue (empirical) as well as the expected similarity ( \( \prod_{k=1}^{n} \operatorname{sinc}(a_k) = \prod_{k=1}^{n} \sin(\pi a_k) / (\pi a_k) \) ) in orange (analytical).

Figure 3: Surface plot of \( \prod_{k=1}^{n} \operatorname{sinc}(a_k) = \prod_{k=1}^{n} \sin(\pi a_k) / (\pi a_k) \) for a two-dimensional grid of displacements a (n = 2), modelling the representation of an SSP encoding a two-dimensional point in space.

Acknowledgements

I'd like to acknowledge Ben Morcos for motivating this investigation with his questions about the observed complex structure in SSP similarity and providing the empirical sample shown in Figure 2, Brent Komer for communications surrounding the distribution of SSP similarity given different choices of X vectors, and Chris Eliasmith for pointing out the connection to the sinc function and for providing feedback. This work was done in connection with research funded by the Laboratory for Physical Sciences (LPS).

References
Nicole S-Y. Dumont and Chris Eliasmith. Accurate representation for spatial cognition using grid cells. In Proceedings of the 42nd Annual Meeting of the Cognitive Science Society, 2020.

Jan Gosmann. An Integrated Model of Context, Short-Term, and Long-Term Memory. PhD thesis, University of Waterloo, 2018. URL http://hdl.handle.net/10012/13498.

Brent Komer and Chris Eliasmith. Efficient navigation using a scalable, biologically inspired spatial representation. In Proceedings of the 42nd Annual Meeting of the Cognitive Science Society, 2020.

Brent Komer, Terrence C. Stewart, Aaron R. Voelker, and Chris Eliasmith. A neural representation of continuous space using fractional binding. In Proceedings of the 41st Annual Meeting of the Cognitive Science Society, Montreal, QC, 2019. Cognitive Science Society.

Tony A. Plate. Holographic reduced representations. IEEE Transactions on Neural Networks, 6(3):623–641, 1995.