A multi-dimensional stream and its signature representation

Hao Ni

November 8, 2018
Abstract
The signature of a path is an essential object in the theory of rough paths. The signature representation of a data stream can recover standard statistics, e.g. the moments of the data stream. The classification of random walks indicates the advantages of using the signature of a stream as the feature set for machine learning.
This short paper is devoted to showing that the signature of the lead-lag transformation is a useful way to encode a multi-dimensional unstructured data stream. We aim to demonstrate the following points:

1. The signature of a discretely sampled stream is a rich statistic and encodes the essential information of the data stream;
2. The truncated signature of a discretely sampled stream provides a summary of the effect of this stream and leads to dimension reduction for the original stream;
3. The signature of a discretely sampled stream can be used for parameter inference and prediction.

The main result is Theorem 4.1, which states that no matter how frequently the path is sampled, the $p$-th moment of the increment process is a linear functional on the truncated signature up to degree $p$.

Let us start by introducing the tensor algebra space, in which the signature of a path takes its values.
Definition 2.1 (Tensor algebra space)
A formal $E$-tensor series is a sequence of tensors $(a_n \in E^{\otimes n})_{n \in \mathbb{N}}$, which we write $a = (a_0, a_1, \ldots)$. There are two binary operations on $E$-tensor series, an addition $+$ and a product $\otimes$, which are defined as follows. Let $a = (a_0, a_1, \ldots)$ and $b = (b_0, b_1, \ldots)$ be two $E$-tensor series. Then we define

$$a + b = (a_0 + b_0, a_1 + b_1, \ldots), \qquad (1)$$

and

$$a \otimes b = (c_0, c_1, \ldots), \qquad (2)$$

where for each $n \geq 0$,

$$c_n = \sum_{k=0}^{n} a_k \otimes b_{n-k}. \qquad (3)$$

The product $a \otimes b$ is also denoted by $ab$. We use the notation $\mathbf{1}$ for the series $(1, 0, 0, \ldots)$, and $\mathbf{0}$ for the series $(0, 0, 0, \ldots)$. If $\lambda \in \mathbb{R}$, then we define $\lambda a$ to be $(\lambda a_0, \lambda a_1, \ldots)$.

Definition 2.2
The space $T((E))$ is defined to be the vector space of all formal $E$-tensor series. Similarly to the real-valued case, we can define the exp mapping on $T((E))$ as follows.

Definition 2.3
Let $a$ be an arbitrary element of $T((E))$. Then $\exp(a)$ is the element of $T((E))$ defined by

$$\exp(a) := \sum_{n=0}^{\infty} \frac{a^{\otimes n}}{n!}.$$

Now we are in a position to give the definition of the signature of a path of bounded variation (finite length).
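The operations of Definitions 2.1-2.3 are easy to prototype once the series is truncated at a fixed degree. The following minimal sketch (plain numpy; the helper names are ours, not the paper's) stores a truncated $E$-tensor series as a list whose $n$-th entry is the degree-$n$ tensor, and implements $+$, $\otimes$ and $\exp$:

```python
# Truncated tensor algebra over E = R^d (Definitions 2.1-2.3), kept up to a
# fixed degree.  A series (a_0, a_1, ..., a_m) is a list of numpy arrays,
# where a_n has shape (d,)*n and a_0 is a scalar.  Illustrative sketch only.
import numpy as np

def tensor_add(a, b):
    """(a + b)_n = a_n + b_n, degree by degree."""
    return [an + bn for an, bn in zip(a, b)]

def tensor_mul(a, b):
    """Truncated product: c_n = sum_{k=0}^{n} a_k (tensor) b_{n-k}."""
    depth = len(a) - 1
    return [sum(np.multiply.outer(a[k], b[n - k]) for k in range(n + 1))
            for n in range(depth + 1)]

def tensor_exp(x, depth):
    """exp of a degree-1 element x in R^d, truncated: (1, x, x(x)x/2!, ...)."""
    out = [np.array(1.0)]
    for n in range(1, depth + 1):
        out.append(np.multiply.outer(out[-1], np.asarray(x)) / n)
    return out

# Example: the degree-1 part of exp(x) (x) exp(y) is x + y.
x, y = np.array([1.0, 2.0]), np.array([0.5, -1.0])
s = tensor_mul(tensor_exp(x, 3), tensor_exp(y, 3))
```

Products of tensor exponentials of this kind are exactly what will appear below when piecewise-linear paths are glued together.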
Definition 2.4 (Signature of a path)
Let $J$ be a compact interval and let $X$ be a continuous function of finite length which maps $J$ to $E$. The signature $S(X)$ of $X$ over the time interval $J$ is the element $(1, X^1, \ldots, X^n, \ldots)$ of $T((E))$ defined for each $n \geq 1$ as follows:

$$X^n = \int \cdots \int_{u_1 < \cdots < u_n,\; u_1, \ldots, u_n \in J} dX_{u_1} \otimes \cdots \otimes dX_{u_n}.$$
Suppose that $\{e_i\}_{i=1}^{d}$ is a basis of $E$; then for every $n \geq 1$, $\{e_{i_1} \otimes \cdots \otimes e_{i_n}\}_{i_1, \ldots, i_n \in \{1, \ldots, d\}}$ forms a basis of $E^{\otimes n}$. Therefore $S(X)$ can be rewritten as follows:

$$S(X) = 1 + \sum_{n=1}^{\infty} \; \sum_{i_1, \ldots, i_n \in \{1, \ldots, d\}} \left( \int \cdots \int_{u_1 < \cdots < u_n} dX^{(i_1)}_{u_1} \cdots dX^{(i_n)}_{u_n} \right) e_{i_1} \otimes \cdots \otimes e_{i_n}.$$
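For a piecewise-linear path these iterated integrals can be computed in closed form: the signature of a single linear segment with increment $\Delta x$ is $\exp(\Delta x)$, and segments are glued together by the tensor product (this is Chen's identity, Theorem 2.1 below). A minimal numpy sketch, with helper names of our own choosing:

```python
# Truncated signature (1, X^1, ..., X^depth) of the piecewise-linear path
# through the rows of `points` (shape (L+1, d)).  Each segment with increment
# dx contributes exp(dx); segments are multiplied in the truncated tensor
# algebra.  Illustrative sketch only.
import numpy as np

def sig_piecewise_linear(points, depth):
    d = points.shape[1]
    sig = [np.array(1.0)] + [np.zeros((d,) * n) for n in range(1, depth + 1)]
    for dx in np.diff(points, axis=0):
        seg = [np.array(1.0)]           # exp(dx), truncated at `depth`
        for n in range(1, depth + 1):
            seg.append(np.multiply.outer(seg[-1], dx) / n)
        sig = [sum(np.multiply.outer(sig[k], seg[n - k]) for k in range(n + 1))
               for n in range(depth + 1)]
    return sig

pts = np.array([[0.0, 0.0], [1.0, 0.5], [0.5, 2.0]])
s = sig_piecewise_linear(pts, 3)
# The first level s[1] is the total increment X_T - X_0.
```

One can also check on this output the shuffle relation of Theorem 2.2 below, e.g. $\pi_{(1)}\pi_{(2)} = \pi_{(1,2)} + \pi_{(2,1)}$.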
Definition 2.5 (Concatenation)  Let $X : [0, s] \longrightarrow E$ and $Y : [s, t] \longrightarrow E$ be two continuous paths. Their concatenation is the path $X * Y$ defined by

$$(X * Y)_u = \begin{cases} X_u, & u \in [0, s]; \\ X_s + Y_u - Y_s, & u \in [s, t], \end{cases}$$

where $0 \leq s \leq t$.

Theorem 2.1 (Chen's identity)
Let $X : [0, s] \longrightarrow E$ and $Y : [s, t] \longrightarrow E$ be two continuous paths with finite $1$-variation. Then

$$S(X * Y) = S(X) \otimes S(Y), \qquad (4)$$

where $0 \leq s \leq t$. The proof can be found in [2].

Let $\{e^*_i\}_{i=1}^{d}$ be a basis of the dual space $E^*$. Then for every $n \in \mathbb{N}$, $\{e^*_{i_1} \otimes \cdots \otimes e^*_{i_n}\}$ can be naturally extended to a basis of $(E^*)^{\otimes n}$, identifying the basis $\left(e^*_I = e^*_{i_1} \otimes \cdots \otimes e^*_{i_n}\right)$ via

$$\langle e^*_{i_1} \otimes \cdots \otimes e^*_{i_n},\; e_{j_1} \otimes \cdots \otimes e_{j_n} \rangle = \delta_{i_1, j_1} \cdots \delta_{i_n, j_n}.$$

The linear action of $(E^*)^{\otimes n}$ on $E^{\otimes n}$ extends naturally to a linear mapping $(E^*)^{\otimes n} \rightarrow T((E))^*$ defined by $e^*_I(a) = e^*_I(a_n)$, where $I = (i_1, \ldots, i_n)$. Hence the linear forms $e^*_I$, as $I$ spans the set of finite words in the letters $1, \ldots, d$, form a basis of $T(E^*)$. Let $T((E))^*$ denote the space of linear forms on $T((E))$ induced by $T(E^*)$. Let us consider a word $I = (i_1, \ldots, i_n)$, where $i_1, \ldots, i_n \in \{1, \ldots, d\}$. Define $\pi_I$ as $e^*_I$ with the domain restricted to the range of the signatures, denoted by $S(V([0, T], E))$; in formula,

$$\pi_I(S(X)) = e^*_I(S(X)),$$

where $X$ is any $E$-valued continuous path of bounded variation.

For any two words $I$ and $J$, the pointwise product of the two linear forms $\pi_I$ and $\pi_J$ as real-valued functions is a quadratic form on $S(V([0, T], E))$, but it is remarkable that it is still a linear form, as stated in Theorem 2.2. Let us first introduce the definition of the shuffle product.

Definition 2.6
We define the set $S_{m,n}$ of $(m, n)$-shuffles to be the subset of permutations in the symmetric group $S_{m+n}$ defined by

$$S_{m,n} = \{\sigma \in S_{m+n} : \sigma(1) < \cdots < \sigma(m),\; \sigma(m+1) < \cdots < \sigma(m+n)\}.$$

Definition 2.7  The shuffle product of $\pi_I$ and $\pi_J$, denoted by $\pi_I ⧢ \pi_J$, is defined as follows:

$$\pi_I ⧢ \pi_J = \sum_{\sigma \in S_{m,n}} \pi_{(k_{\sigma^{-1}(1)}, \ldots, k_{\sigma^{-1}(m+n)})},$$

where $I = (i_1, i_2, \ldots, i_n)$, $J = (j_1, j_2, \ldots, j_m)$ and $(k_1, \ldots, k_{m+n}) = (i_1, \ldots, i_n, j_1, \ldots, j_m)$.

Theorem 2.2 (Shuffle Product Property)
Let $X$ be a path of bounded variation. Let $I$ and $J$ be two arbitrary indices. The following identity holds:

$$\pi_I(S(X)) \, \pi_J(S(X)) = (\pi_I ⧢ \pi_J)(S(X)).$$

In the following we restrict our discussion to paths observed at a finite number of time stamps and taking values in $E := \mathbb{R}^d$. Let $\{x_n\}_{n=1}^{L}$ be an increment process, where $x_n \in E$. (You can think of it as a return process.) Let $X := \{X_n\}_{n=0}^{L}$ denote the corresponding partial sum process of $\{x_n\}_{n=1}^{L}$. (It can be thought of as a price process.) Mathematically, $X$ is defined as follows:

$$X_0 = 0; \qquad X_n = \sum_{i=1}^{n} x_i, \quad n = 1, \ldots, L.$$

Now let us introduce the lead-lag transformation associated with a $d$-dimensional stream $X$ ([1]).

Definition 3.1 (Lead-Lag Transformation)
Let $X := \{X_n\}_{n=0}^{L}$ be a $d$-dimensional discretely sampled path. The lead-lag transformation associated with $X$ is the $2d$-dimensional path $\hat{X}$ obtained by linear interpolation of $\{\hat{X}_n\}_{n=0}^{2L}$, where $\hat{X}^{(i)}_0 = X^{(i)}_0$ and, for every $n \in \{0, \ldots, L-1\}$ and every $i \in \{1, \ldots, d\}$,

$$\hat{X}^{(i)}_{2n+1} = \hat{X}^{(i)}_{2n+2} = X^{(i)}_{n+1}, \qquad \hat{X}^{(i+d)}_{2n} = \hat{X}^{(i+d)}_{2n+1} = X^{(i)}_{n}.$$

Let $\mathcal{L}$ denote the lead-lag transformation operator. The lead-lag process $\hat{X}$ thus visits, in order, the points

$$(X_0, X_0),\; (X_1, X_0),\; (X_1, X_1),\; (X_2, X_1),\; \ldots,\; (X_n, X_{n-1}),\; (X_n, X_n),$$

where the first $d$ coordinates form the lead component and the last $d$ coordinates form the lag component.

Lemma 3.1 (Multiplicativity of the lead-lag transformation)
For any two discretely sampled paths $X = \{X_n\}_{n=0}^{L_1}$ and $Y = \{Y_n\}_{n=0}^{L_2}$,

$$\mathcal{L}(X * Y) = \mathcal{L}(X) * \mathcal{L}(Y),$$

where $X * Y$ denotes the concatenation of the two discretely sampled paths, i.e.

$$(X * Y)_n = \begin{cases} X_n & \text{if } 0 \leq n \leq L_1; \\ X_{L_1} - Y_0 + Y_{n - L_1} & \text{if } L_1 \leq n \leq L_1 + L_2. \end{cases}$$

Let us define the signature of a discretely sampled stream and discuss its relevant properties.
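The lead-lag transformation of Definition 3.1 is straightforward to implement. The sketch below (numpy only; the function name is our own) returns the $2L + 1$ skeleton points of the $2d$-dimensional piecewise-linear path $\hat{X}$:

```python
# Lead-lag transformation of Definition 3.1: from X_0..X_L in R^d build the
# skeleton hatX_0..hatX_{2L} in R^{2d}.  Coordinates 1..d are the lead
# component and d+1..2d the lag component.  Illustrative sketch only.
import numpy as np

def lead_lag(X):
    """X: array of shape (L+1, d) -> array of shape (2L+1, 2d)."""
    X = np.asarray(X, dtype=float)
    L = X.shape[0] - 1
    out = np.empty((2 * L + 1, 2 * X.shape[1]))
    for n in range(L + 1):
        out[2 * n] = np.concatenate([X[n], X[n]])              # (X_n, X_n)
        if n < L:
            out[2 * n + 1] = np.concatenate([X[n + 1], X[n]])  # (X_{n+1}, X_n)
    return out

print(lead_lag([[0.0], [1.0], [3.0]]))
```

At each tick the lead coordinate jumps first and the lag coordinate then catches up, which is what creates the area terms exploited below.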
Definition 3.2 (The signature representation of a discretely sampled stream)
Let $X$ be a discretely sampled path in $E$ and let $\hat{X}$ be the lead-lag transformation of $X$. The signature of $X$ is defined to be the signature of $\hat{X}$, denoted by $S(X)$. Let $S_d(X)$ denote the truncated signature of $X$ up to degree $d$. Let $DS$ denote the range of signatures of the lead-lag transformations of discretely sampled paths in $E$.

Lemma 3.2 (Chen's Identity for Discretely Sampled Paths)
For any two discretely sampled paths $X = \{X_n\}_{n=0}^{L_1}$ and $Y = \{Y_n\}_{n=0}^{L_2}$,

$$S(\mathcal{L}(X * Y)) = S(\mathcal{L}(X)) \otimes S(\mathcal{L}(Y)).$$

Definition 3.3 (Additive functional on $DS$)  Let $K$ be a linear form on $T((E))$. We say that $K$ is additive on $DS$ if and only if for every $S(X), S(Y) \in DS$, it follows that

$$K(S(X * Y)) = K(S(X)) + K(S(Y)).$$

For convenience, let us adopt the following notation.
Definition 3.4
Fix any positive integer $p$. Let $K^{(p)}_I$ denote the set of linear forms on $T((E))$ that can be written as

$$\sum_{|J| = p,\; J = (J_1, I)} C_J \, \pi_{(J)},$$

where the $C_J$ are constants and the summation is taken over all $J$ such that $J$ is of length $p$ and ends in the substring $I$.

Let us focus on the one-dimensional case; we will show that the signature of $X$ contains rich information about the path $X$ and is a good basis to represent standard statistics, for example the empirical moments of the increments of $X$ (Theorem 4.1). Let us start with a discussion of the properties of the signature. By Chen's identity and a simple calculation, the signature of a path in $DS$ can be given explicitly as follows:

Lemma 4.1 (Signature of a one-dimensional discrete path)
For any $X \in DS$ with associated increment process $\{x_i\}_{i=1}^{L}$,

$$S(X) = \bigotimes_{i=1}^{L} \exp(x_i e_1) \otimes \exp(x_i e_2).$$

Lemma 4.2
For every index $I$ ending in $2$ and any positive integer $p$, there exists $K \in K^{(|I|+p)}_2$ such that for any $S(X_L) \in DS$,

$$\pi_{(I, M_p)}(S(X_L)) = K(S(X_L)),$$

where $M_p = (1, \ldots, 1)$, i.e. $p$ copies of $1$. Likewise, for every index $I$ ending in $1$ and any positive integer $p$, there exists $K \in K^{(|I|+p)}_1$ such that for any $S(X_L) \in DS$,

$$\pi_{(I, K_p)}(S(X_L)) = K(S(X_L)),$$

where $K_p = (2, \ldots, 2)$, i.e. $p$ copies of $2$.

Proof.
First of all, let us prove the case $p = 1$. As $I$ ends in $2$, we can rewrite $I$ as $(J, 2)$. Since $(\pi_{(1)} - \pi_{(2)})(S(X)) = 0$ for every $S(X) \in DS$, we have

$$0 = \pi_I \, (\pi_{(1)} - \pi_{(2)}) = \pi_{(J ⧢ (1),\, 2)} + \pi_{(I, 1)} - \pi_I ⧢ \pi_{(2)},$$

and therefore

$$\pi_{(I, 1)} = \pi_I ⧢ \pi_{(2)} - \pi_{(J ⧢ (1),\, 2)} \in K^{(|I|+1)}_2.$$

Then we prove the statement by induction on $p$. Let $K_p$ be $p$ copies of $2$. Then

$$0 = \pi_I \, (\pi_{(M_p)} - \pi_{(K_p)}) = \pi_{(J ⧢ M_p,\, 2)} + \pi_{(I ⧢ M_{p-1},\, 1)} - \pi_I ⧢ \pi_{K_p}.$$

Let us investigate the term $\pi_{(I ⧢ M_{p-1},\, 1)}$. We have

$$(I ⧢ M_{p-1}, 1) = (I, M_p) + \sum_{k=1}^{p-1} (J ⧢ M_k, 2, M_{p-k}),$$

and thus

$$\pi_{(I ⧢ M_{p-1},\, 1)} = \pi_{(I, M_p)} + \sum_{k=1}^{p-1} \pi_{(J ⧢ M_k,\, 2,\, M_{p-k})}.$$

For any $k = 1, \ldots, p-1$, by the induction hypothesis there exists a linear functional $G_k \in K^{(|I|+p)}_2$ such that for any $S(X) \in DS$,

$$\pi_{(J ⧢ M_k,\, 2,\, M_{p-k})}(S(X)) = G_k(S(X)).$$

Therefore, writing $G = \sum_{k=1}^{p-1} G_k$,

$$\pi_{(I, M_p)} = \pi_I ⧢ \pi_{K_p} - \pi_{(J ⧢ M_p,\, 2)} - \sum_{k=1}^{p-1} \pi_{(J ⧢ M_k,\, 2,\, M_{p-k})} = \pi_I ⧢ \pi_{K_p} - \pi_{(J ⧢ M_p,\, 2)} - G \in K^{(|I|+p)}_2.$$

This completes the first part of the statement. The same strategy proves the second part.
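Lemma 4.1 and the relation $\pi_{(1)} = \pi_{(2)}$ on $DS$ used in the proof above can be checked numerically. The sketch below (numpy only; helper names are ours) builds $S(X) = \bigotimes_i \exp(x_i e_1) \otimes \exp(x_i e_2)$ up to degree $2$, and also previews the $p = 2$ case of Theorem 4.1, where $\sum_i x_i^2$ is read off by the linear functional $\pi_{(1,2)} - \pi_{(2,1)}$:

```python
# Degree-2 signature of the lead-lag path of a 1-d increment stream x, via the
# factorisation of Lemma 4.1.  Coordinate 1 = lead, coordinate 2 = lag
# (0-indexed in the arrays below).  Illustrative sketch only.
import numpy as np

def tensor_mul(a, b, depth):
    return [sum(np.multiply.outer(a[k], b[n - k]) for k in range(n + 1))
            for n in range(depth + 1)]

def tensor_exp(v, depth):
    out = [np.array(1.0)]
    for n in range(1, depth + 1):
        out.append(np.multiply.outer(out[-1], v) / n)
    return out

def leadlag_sig(x, depth=2):
    e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    sig = [np.array(1.0)] + [np.zeros((2,) * n) for n in range(1, depth + 1)]
    for xi in x:
        sig = tensor_mul(sig, tensor_exp(xi * e1, depth), depth)  # lead moves
        sig = tensor_mul(sig, tensor_exp(xi * e2, depth), depth)  # lag follows
    return sig

x = np.array([0.3, -1.2, 0.7, 0.1])
s = leadlag_sig(x)
print(np.isclose(s[1][0], s[1][1]))                       # pi_(1) = pi_(2) on DS
print(np.isclose(s[2][0, 1] - s[2][1, 0], np.sum(x**2)))  # sum of x_i^2
```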
Remark 4.1
Since $\pi_{M_p} = \pi_{K_p}$ on $DS$, Lemma 4.2 shows that for each index $I$, $\pi_{(I)}$ can be rewritten as a linear functional in $K^{(|I|)}_1$, and likewise in $K^{(|I|)}_2$.

Lemma 4.3
For any index $I = (i_1, \ldots, i_{n-1}, 2)$ and any $S(X_L) \in DS$,

$$\pi_{(I, 1)}(S(X_L)) = \sum_{j=1}^{L} \pi_I(S(X_{j-1}))\, x_j. \qquad (5)$$

Proof.
We show this lemma by induction on $L$. For $L = 1$, both sides of (5) are equal to $0$. By Chen's identity, for $L \geq 1$ it follows that

$$\pi_{(I,1)}(S(X_L)) = \pi_{(I,1)}(S(X_{L-1}) \otimes S(X_{L-1, L})) = \pi_{(I,1)}(S(X_{L-1})) + \pi_{(I)}(S(X_{L-1}))\, x_L,$$

because $S(X_{L-1, L}) = \exp(x_L e_1) \otimes \exp(x_L e_2)$. Then it follows by the induction hypothesis that

$$\pi_{(I,1)}(S(X_L)) = \sum_{j=1}^{L-1} \pi_I(S(X_{j-1}))\, x_j + \pi_{(I)}(S(X_{L-1}))\, x_L = \sum_{j=1}^{L} \pi_I(S(X_{j-1}))\, x_j.$$

Lemma 4.4
For any index $I = (i_1, \ldots, i_{n-1}, 2)$ and $k \geq 1$, there exists a linear functional $F$ depending only on $I$ and $k$, with $F \in K^{(n+k)}$, such that for any $S(X_L) \in DS$,

$$F(S(X_L)) = \sum_{j=1}^{L} \pi_I(S(X_{j-1}))\, x_j^k. \qquad (6)$$

Proof.  For $k = 1$, this is proved in Lemma 4.3. Assume that the statement holds for all exponents up to $K - 1$ and consider $k = K$. Writing $(I, 1^{*j})$ for $I$ followed by $j$ copies of $1$, Chen's identity gives

$$\pi_{(I, 1^{*k})}(S(X_L)) - \pi_{(I, 1^{*k})}(S(X_{L-1})) = \sum_{j=0}^{k-1} \pi_{(I, 1^{*j})}(S(X_{L-1})) \, \frac{x_L^{k-j}}{(k-j)!}.$$

After rearranging the above formula, we have

$$\pi_{(I)}(S(X_{L-1}))\, x_L^k = k! \left( \pi_{(I, 1^{*k})}(S(X_L)) - \pi_{(I, 1^{*k})}(S(X_{L-1})) - \sum_{j=1}^{k-1} \pi_{(I, 1^{*j})}(S(X_{L-1})) \frac{x_L^{k-j}}{(k-j)!} \right).$$

By taking the telescoping sum of the above equation, we have

$$\sum_{i=1}^{L} \pi_{(I)}(S(X_{i-1}))\, x_i^k = k!\, \pi_{(I, 1^{*k})}(S(X_L)) - k! \sum_{j=1}^{k-1} \frac{1}{(k-j)!} \sum_{i=1}^{L} \pi_{(I, 1^{*j})}(S(X_{i-1}))\, x_i^{k-j}.$$

By Lemma 4.2, there is a linear functional $G$ depending only on $(I, 1^{*j})$ such that $\pi_{(I, 1^{*j})} = G$ on $DS$, with $G$ a combination of words ending in $2$. Then, by the induction hypothesis, $\sum_{i=1}^{L} \pi_{(I, 1^{*j})}(S(X_{i-1}))\, x_i^{k-j}$ can be rewritten as a linear functional in $K^{(n+k)}$. Since $\pi_{(I, 1^{*k})}$ can also be rewritten as a linear functional in $K^{(n+k)}$ (Remark 4.1), so can $\sum_{i=1}^{L} \pi_{(I)}(S(X_{i-1}))\, x_i^k$. Now the proof is complete.

Lemma 4.5
Let $L \in K^{(p)}_1$ and suppose $L$ is additive. Then there exists $\tilde{L} \in K^{(p+2)}_1$ such that

$$\tilde{L}(S(X_n)) = -\sum_{i=1}^{n} L(S(X_i))\, x_i^2.$$

Proof.
Let $L := \sum_I C_I \pi_{(I)}$, where each word $I$ has length $p$ and ends in $1$. For $n \geq 1$, Chen's identity gives

$$\pi_{(I,2,1)}(S(X_n)) = \pi_{(I,2,1)}(S(X_{n-1}) \otimes S(X_{n-1,n})) = \pi_{(I,2,1)}(S(X_{n-1})) + \pi_{(I,2)}(S(X_{n-1}))\, x_n,$$

since every suffix of $(I,2,1)$ of length at least $2$ vanishes on $S(X_{n-1,n}) = \exp(x_n e_1) \otimes \exp(x_n e_2)$. Similarly we have

$$\pi_{(I,2,2)}(S(X_n)) = \pi_{(I,2,2)}(S(X_{n-1})) + \pi_{(I,2,2)}(S(X_{n-1,n})) + \pi_{(I,2)}(S(X_{n-1}))\, \pi_{(2)}(S(X_{n-1,n})) + \pi_{(I)}(S(X_{n-1}))\, \pi_{(2,2)}(S(X_{n-1,n})) + R_I(S(X_{n-1}), x_n)$$
$$= \pi_{(I,2,2)}(S(X_{n-1})) + \pi_{(I)}(S(X_{n-1,n}))\, \frac{x_n^2}{2} + \pi_{(I)}(S(X_{n-1}))\, \frac{x_n^2}{2} + \pi_{(I,2)}(S(X_{n-1}))\, x_n + R_I(S(X_{n-1}), x_n),$$

where $R_I$ collects the remaining cross terms:

$$R_I(S(X_{n-1}), x_n) = \sum_{\substack{J_1 * J_2 = (I,2,2) \\ J_1 \neq \emptyset,\; |J_2| \geq 3}} \pi_{(J_1)}(S(X_{n-1}))\, \pi_{J_2}(S(X_{n-1,n})) = \sum_{\substack{J_1 * J_2 = (I,2,2) \\ J_1 \neq \emptyset,\; |J_2| \geq 3}} \pi_{(J_1)}(S(X_{n-1}))\, c_{J_2}\, x_n^{|J_2|}.$$

The last equality comes from the fact that each $\pi_{J_2}(S(X_{n-1,n})) = \pi_{J_2}(\exp(x_n e_1) \otimes \exp(x_n e_2))$ is a constant $c_{J_2}$ times $x_n^{|J_2|}$; in particular, since $I$ ends in $1$,

$$\pi_{(I,2,2)}(S(X_{n-1,n})) = \pi_{(I)}(\exp(x_n e_1))\, \pi_{(2,2)}(\exp(x_n e_2)) = \pi_{(I)}(S(X_{n-1,n}))\, \frac{x_n^2}{2}.$$

By Lemma 4.4, there exists a linear functional $G_I \in K^{(p+2)}$ such that

$$G_I(S(X_n)) - G_I(S(X_{n-1})) = R_I(S(X_{n-1}), x_n).$$

Thus it follows that

$$\pi_{(I,2,1)}(S(X_n)) - \pi_{(I,2,2)}(S(X_n)) = \pi_{(I,2,1)}(S(X_{n-1})) - \pi_{(I,2,2)}(S(X_{n-1})) - \big(\pi_{(I)}(S(X_{n-1})) + \pi_{(I)}(S(X_{n-1,n}))\big)\, \frac{x_n^2}{2} - \big(G_I(S(X_n)) - G_I(S(X_{n-1}))\big).$$

Following this notation, set $G = \sum_I C_I G_I$,

$$\tilde{L}_0 = \sum_I C_I \left( \pi_{(I,2,1)} - \pi_{(I,2,2)} \right) + G, \qquad f_n = \tilde{L}_0(S(X_n)),$$

and note that obviously $f_0 = 0$. Moreover, since $L$ is additive, $L(S(X_{n-1})) + L(S(X_{n-1,n})) = L(S(X_n))$, and it follows that

$$f_n = f_{n-1} - \sum_I C_I \big( \pi_{(I)}(S(X_{n-1})) + \pi_{(I)}(S(X_{n-1,n})) \big)\, \frac{x_n^2}{2} = f_{n-1} - \big( L(S(X_{n-1})) + L(S(X_{n-1,n})) \big)\, \frac{x_n^2}{2} = f_{n-1} - L(S(X_n))\, \frac{x_n^2}{2}.$$

By the telescoping sum of $f_n$, it holds that

$$f_n = \sum_{i=1}^{n} (f_i - f_{i-1}) + f_0 = -\frac{1}{2} \sum_{i=1}^{n} L(S(X_i))\, x_i^2.$$

Hence $\tilde{L} := 2 \tilde{L}_0$, after rewriting the words ending in $2$ via Lemma 4.2, belongs to $K^{(p+2)}_1$ and satisfies the claim.

Theorem 4.1 ($p$-th moment)
For any integer $p > 0$, there exist two linear functionals $L^{(1)}_p \in K^{(p)}_1$ and $L^{(2)}_p \in K^{(p)}_2$ such that for every path $X$, the following equation holds:

$$L^{(1)}_p(S(X)) = L^{(2)}_p(S(X)) = \sum_{i=1}^{N} x_i^p. \qquad (7)$$

Obviously if (7) is true, then $L^{(1)}_p$ and $L^{(2)}_p$ are both additive.

Proof.
Let us prove it by induction on $p$. It is true for $p = 1, 2$. Suppose that it holds for all $p < P$, and let us study the case $p = P$. We have

$$\sum_{i=1}^{N} x_i^P = \sum_{i=1}^{N} \left( L^{(1)}_{P-2}(S(X_i)) - L^{(2)}_{P-2}(S(X_{i-1})) \right) x_i^2 = \sum_{i=1}^{N} L^{(1)}_{P-2}(S(X_i))\, x_i^2 - \sum_{i=1}^{N} L^{(2)}_{P-2}(S(X_{i-1}))\, x_i^2.$$

By Lemma 4.5, since $L^{(1)}_{P-2}$ is additive, $\sum_{i=1}^{N} L^{(1)}_{P-2}(S(X_i))\, x_i^2$ can be rewritten as a linear functional $G_1 \in K^{(P)}_1$ such that

$$G_1(S(X_N)) = \sum_{i=1}^{N} L^{(1)}_{P-2}(S(X_i))\, x_i^2.$$

By Lemma 4.4, there exists $G_2 \in K^{(P)}_2$ such that

$$G_2(S(X_N)) = \sum_{i=1}^{N} L^{(2)}_{P-2}(S(X_{i-1}))\, x_i^2.$$

Hence $\sum_{i=1}^{N} x_i^P = (G_1 - G_2)(S(X_N))$, and rewriting this combination via Lemma 4.2 (Remark 4.1) yields the required $L^{(1)}_P \in K^{(P)}_1$ and $L^{(2)}_P \in K^{(P)}_2$.

Multi-Dimensional Stream Case
The following lemma states that the empirical covariance of the increments of a multi-dimensional data stream can be fully characterized by its signature.
Lemma 5.1
Let $X := \{X_n\}_{n=0}^{L}$ be a $d$-dimensional discretely sampled stream, let $\{x_n\}_{n=1}^{L}$ be the associated increment process, and let $\hat{X}$ be the corresponding lead-lag process of $X$. For any $i_1, i_2 \in \{1, \ldots, d\}$, the following identity holds:

$$\sum_{n=1}^{L} x^{(i_1)}_n x^{(i_2)}_n = 2 \left( \pi_{(i_1,\, i_2+d)}(S(X)) - \pi_{(i_1,\, i_2)}(S(X)) \right).$$

Proof.
For the case $i_1 = i_2 = i$, it holds that

$$\sum_{n=1}^{L} (x^{(i)}_n)^2 = \pi_{(i,\, i+d)}(S(X)) - \pi_{(i+d,\, i)}(S(X)) = 2 \left( \pi_{(i,\, i+d)}(S(X)) - \pi_{(i, i)}(S(X)) \right),$$

as

$$\pi_{(i,\, i+d)} + \pi_{(i+d,\, i)} = \pi_{(i)} ⧢ \pi_{(i+d)} = \pi_{(i)} ⧢ \pi_{(i)} = 2 \pi_{(i, i)}$$

on $DS$, where we used $\pi_{(i)} = \pi_{(i+d)}$ (the lead and lag components have the same total increment). For the case $i_1 \neq i_2$, the signature of the path $\hat{X}^{(i_1, i_2, i_2+d)}$, which is the $(i_1, i_2, i_2+d)$ coordinate projection of $\hat{X}$, is given as

$$S(\hat{X}^{(i_1, i_2, i_2+d)}) = \bigotimes_{n=1}^{L} \exp\left( x^{(i_1)}_n e_{i_1} + x^{(i_2)}_n e_{i_2} \right) \otimes \exp\left( x^{(i_2)}_n e_{i_2+d} \right),$$

and it follows that

$$\pi_{(i_1,\, i_2+d)}(S(X)) = \sum_{n \leq m} x^{(i_1)}_n x^{(i_2)}_m, \qquad \pi_{(i_1,\, i_2)}(S(X)) = \sum_{n < m} x^{(i_1)}_n x^{(i_2)}_m + \frac{1}{2} \sum_{n} x^{(i_1)}_n x^{(i_2)}_n,$$

whence $2 \left( \pi_{(i_1,\, i_2+d)}(S(X)) - \pi_{(i_1,\, i_2)}(S(X)) \right) = \sum_{n} x^{(i_1)}_n x^{(i_2)}_n$.

Example 6.1  We simulate samples of the pair $\{\rho_n, X^{\rho_n}\}_{n=1}^{N=400}$, where the $\rho_n$ are i.i.d. and uniformly distributed in $[-1, 1]$, and for each $\rho_n$, $X^{\rho_n}$ is generated as a $2$-dimensional random walk of length $L$ with correlation $\rho_n$, i.e.

$$x^{\rho_n} \overset{\text{iid}}{\sim} \mathcal{N}\left( 0,\; \sigma^2 \begin{pmatrix} 1 & \rho_n \\ \rho_n & 1 \end{pmatrix} \right).$$

How can we estimate the model parameter $\rho$ for each sample path?

Our method is simply to do a linear regression of the correlation parameter against the truncated signature of the sample path. To better judge the performance of our method, we use the empirical correlation as a benchmark. The empirical correlation for each sample path $X^\rho$ is defined as follows:

$$\hat{\rho} = \frac{\sum_{n=1}^{L} \left( x^{(1)}_\rho(n) - \bar{x}^{(1)}_\rho \right) \left( x^{(2)}_\rho(n) - \bar{x}^{(2)}_\rho \right)}{\sqrt{\sum_{n=1}^{L} \left( x^{(1)}_\rho(n) - \bar{x}^{(1)}_\rho \right)^2 \sum_{n=1}^{L} \left( x^{(2)}_\rho(n) - \bar{x}^{(2)}_\rho \right)^2}}.$$

The parameters chosen are as follows: $L = 120$, $N = 200$, $d = 3$. Figure 2 shows that the empirical correlation is better in terms of MSE, especially when $\rho$ is near $+1$ and $-1$.
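Lemma 5.1 can be illustrated numerically. The sketch below (numpy only; the names and the random seed are our own choices) builds the degree-2 signature of the 4-dimensional lead-lag path of a correlated 2-d random walk, verifies the identity of Lemma 5.1, and forms an (uncentered) correlation estimate from degree-2 signature terms alone:

```python
# Check of Lemma 5.1 for d = 2: coordinates 1,2 are the lead and 3,4 the lag,
# and sum_n x^(i1)_n x^(i2)_n = 2(pi_(i1, i2+d) - pi_(i1, i2))(S(X)).
import numpy as np

def tensor_mul(a, b, depth):
    return [sum(np.multiply.outer(a[k], b[n - k]) for k in range(n + 1))
            for n in range(depth + 1)]

def tensor_exp(v, depth):
    out = [np.array(1.0)]
    for n in range(1, depth + 1):
        out.append(np.multiply.outer(out[-1], v) / n)
    return out

def leadlag_sig(x, depth=2):
    """x: increments of shape (L, 2); signature of the 4-d lead-lag path."""
    sig = [np.array(1.0)] + [np.zeros((4,) * n) for n in range(1, depth + 1)]
    for xn in x:
        lead = np.array([xn[0], xn[1], 0.0, 0.0])  # lead coordinates move first
        lag = np.array([0.0, 0.0, xn[0], xn[1]])   # then the lag catches up
        sig = tensor_mul(sig, tensor_exp(lead, depth), depth)
        sig = tensor_mul(sig, tensor_exp(lag, depth), depth)
    return sig

rng = np.random.default_rng(0)
rho = 0.6
z = rng.standard_normal((120, 2))
x = np.column_stack([z[:, 0], rho * z[:, 0] + np.sqrt(1 - rho**2) * z[:, 1]])

s = leadlag_sig(x)
qv = lambda i, j: 2 * (s[2][i, j + 2] - s[2][i, j])     # Lemma 5.1 (0-indexed)
print(np.isclose(qv(0, 1), np.sum(x[:, 0] * x[:, 1])))  # the identity holds
rho_sig = qv(0, 1) / np.sqrt(qv(0, 0) * qv(1, 1))       # uncentered correlation
```

Here `rho_sig` depends only on degree-2 signature coordinates, which is the sense in which the (uncentered) empirical correlation is a non-linear function of the truncated signature.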
However, due to the nature of polynomial regression, the signature approach performs worse when $\rho$ is near the boundary. The reason the signature approach is not fully satisfactory is not that the truncated signature fails to include enough information about the path. Rather, the regression method we used is too simple, and it should be combined with advanced non-linear regression techniques, e.g. rational regression or some local regression methods. Theoretically, if properly combined with advanced regression techniques, we should be able to recover the empirical correlation: by the definition of the signature of a stream, Lemma 5.1 shows that the empirical covariance/variance of the increment process is a linear combination of the truncated signature up to degree $2$, and the ratio of the empirical covariance to the square root of the product of the empirical variances of the two coordinate increments gives the empirical correlation.

Example 6.2  Let $X$ denote a standard $3$-dimensional random walk of length $L$, and let $Y$ denote another random walk, where $y^{(1)}, y^{(2)}$ are independent and move to $+1$ and $-1$ with probability $0.5$ each, but $y^{(3)} = y^{(1)} y^{(2)}$. Given one realization of a random walk of length $L$ generated either by the distribution of $X$ or by that of $Y$, which distribution is this realized path from?

In this example, we cannot distinguish which distribution one sample path is generated from by looking at its empirical mean and covariance matrix of the increment distribution, simply because

$$\mathbb{E}[x] = \mathbb{E}[y] = 0; \qquad \operatorname{cov}[x] = \operatorname{cov}[y] = I.$$

But we can almost perfectly classify the sample paths using the truncated signatures in this case. We summarize the procedure as follows:

1. We simulate $N$ paths based on the distributions of $X$ and $Y$ respectively.
2. Compute the truncated signatures of those sample paths up to degree $d$.
3.
For each sample path $X$, define the response variable in the following way:

$$f(X) = \begin{cases} 1 & \text{if } X \text{ is sampled from } X; \\ 0 & \text{if } X \text{ is sampled from } Y. \end{cases}$$

4. We randomly select half of the dataset as the learning set and the rest as the backtesting set, and apply an SVM classification method to $f(X)$ against $S_d(X)$ in the learning set, where $d = 3$.
5. After obtaining the classifier $\hat{f}$, for any new given path $X^*$, the estimated class of $X^*$ is given by $\hat{f}(X^*)$.

In this example, we choose $N = 200$, $L = 100$ and $d = 3$, and the incorrect classification ratio is very low. Note that the sample space of $Y$ is actually a subspace of the sample space of $X$; theoretically, if a realization of $X$ lies in the sample space of $Y$, its category is not distinguishable from this sample path trajectory.

[1] Guy Flint, Ben Hambly, and Terry Lyons. Discretely sampled signals and the rough Hoff process. arXiv preprint arXiv:1310.4054, 2013.

[2] Terry Lyons, Thierry Lévy, and Michael Caruana. Differential Equations Driven by Rough Paths. Springer, 2006.