Sean Simmons
Massachusetts Institute of Technology
Publication
Featured research published by Sean Simmons.
Cell Systems | 2016
Sean Simmons; Cenk Sahinalp; Bonnie Berger
The proliferation of large genomic databases offers the potential to perform increasingly larger-scale genome-wide association studies (GWASs). Due to privacy concerns, however, access to these data is limited, greatly reducing their usefulness for research. Here, we introduce a computational framework for performing GWASs that adapts principles of differential privacy (a cryptographic theory that facilitates secure analysis of sensitive data) to both protect private phenotype information (e.g., disease status) and correct for population stratification. This framework enables us to produce privacy-preserving GWAS results based on EIGENSTRAT and linear mixed model (LMM)-based statistics, both of which correct for population stratification. We test our differentially private statistics, PrivSTRAT and PrivLMM, on simulated and real GWAS datasets and find they are able to protect privacy while returning meaningful results. Our framework can be used to securely query private genomic datasets to discover which specific genomic alterations may be associated with a disease, thus increasing the availability of these valuable datasets.
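As a rough illustration of the differential privacy idea the abstract refers to, the sketch below applies the textbook Laplace mechanism to a single count. It is not the PrivSTRAT or PrivLMM method; the function names `laplace_noise` and `private_count`, and the scenario of releasing a case count, are assumptions made only for this example.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with epsilon-differential privacy (Laplace mechanism).

    Assumption: changing one individual's record changes the count by at
    most 1, so the sensitivity of the query is 1.
    """
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical query: number of cases carrying a given allele.
    print(private_count(120, epsilon=0.5, rng=rng))
```

A smaller `epsilon` gives stronger privacy but adds more noise, which is the accuracy trade-off that differentially private GWAS methods have to manage.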
Bioinformatics | 2016
Sean Simmons; Bonnie Berger
Motivation: As genomics moves into the clinic, there has been much interest in using this medical data for research. At the same time the use of such data raises many privacy concerns. These circumstances have led to the development of various methods to perform genome-wide association studies (GWAS) on patient records while ensuring privacy. In particular, there has been growing interest in applying differentially private techniques to this challenge. Unfortunately, up until now all methods for finding high scoring SNPs in a differentially private manner have had major drawbacks in terms of either accuracy or computational efficiency. Results: Here we overcome these limitations with a substantially modified version of the neighbor distance method for performing differentially private GWAS, and thus are able to produce a more viable mechanism. Specifically, we use input perturbation and an adaptive boundary method to overcome accuracy issues. We also design and implement a convex analysis based algorithm to calculate the neighbor distance for each SNP in constant time, overcoming the major computational bottleneck in the neighbor distance method. It is our hope that methods such as ours will pave the way for more widespread use of patient data in biomedical research. Availability and implementation: A Python implementation is available at http://groups.csail.mit.edu/cb/DiffPriv/. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
Acta Informatica | 2011
Francine Blanchet-Sadri; Robert Mercaş; Sean Simmons; Eric Weissenstein
The problem of classifying all the avoidable binary patterns in (full) words has been completely solved (see Chap. 3 of M. Lothaire, Algebraic Combinatorics on Words, Cambridge University Press, 2002). In this paper, we classify all the avoidable binary patterns in partial words, or sequences that may have some undefined positions called holes. In particular we show that, if we do not substitute any variable of the pattern by a partial word consisting of only one hole, the avoidability index of the pattern remains the same as in the full word case.
Cryptologia | 2009
Sean Simmons
Simplified AES was developed in 2003, as a teaching tool to help students understand AES. It was designed so that the two primary attacks on symmetric-key block ciphers of that time, differential cryptanalysis and linear cryptanalysis, are not trivial on simplified AES. Algebraic cryptanalysis is a technique that uses modern equation solvers to attack cryptographic algorithms. We will use algebraic cryptanalysis to attack simplified AES.
Journal of Combinatorial Theory | 2012
Francine Blanchet-Sadri; Jane I. Kim; Robert Mercaş; William Severa; Sean Simmons; Dimin Xu
Erdős raised the question whether there exist infinite abelian square-free words over a given alphabet, that is, words in which no two adjacent subwords are permutations of each other. It can easily be checked that no such word exists over a three-letter alphabet. However, infinite abelian square-free words have been constructed over alphabets of sizes as small as four. In this paper, we investigate the problem of avoiding abelian squares in partial words, or sequences that may contain some holes. In particular, we give lower and upper bounds for the number of letters needed to construct infinite abelian square-free partial words with finitely or infinitely many holes. Several of our constructions are based on iterating morphisms. In the case of one hole, we prove that the minimal alphabet size is four, while in the case of more than one hole, we prove that it is five. We also investigate the number of partial words of length n with a fixed number of holes over a five-letter alphabet that avoid abelian squares and show that this number grows exponentially with n.
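The definition of an abelian square, and the remark that no infinite abelian square-free word exists over three letters, can be checked by brute force for full words (no holes). The sketch below is only an illustration and is not taken from the paper; the function names are hypothetical.

```python
from collections import Counter


def has_abelian_square(word: str) -> bool:
    """Return True if `word` contains two adjacent blocks of equal length
    that are permutations of each other (an abelian square)."""
    n = len(word)
    for start in range(n):
        for half in range(1, (n - start) // 2 + 1):
            left = word[start:start + half]
            right = word[start + half:start + 2 * half]
            if Counter(left) == Counter(right):
                return True
    return False


def longest_abelian_square_free(alphabet: str, limit: int = 12) -> int:
    """Length of the longest abelian square-free word over `alphabet`,
    searching up to `limit` letters.

    Every prefix of an abelian square-free word is itself abelian
    square-free, so it is enough to extend surviving words one letter
    at a time.
    """
    current = [""]
    length = 0
    while length < limit:
        extended = [w + c for w in current for c in alphabet
                    if not has_abelian_square(w + c)]
        if not extended:
            break
        current = extended
        length += 1
    return length


if __name__ == "__main__":
    print(has_abelian_square("abcbac"))        # "abc" and "bac" are adjacent permutations
    print(longest_abelian_square_free("abc"))  # the search stops at a finite length
```

Over the three-letter alphabet the search stops at length 7 (for example `abacaba`), so every ternary word of length 8 contains an abelian square.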
Language and Automata Theory and Applications | 2010
Francine Blanchet-Sadri; Jane I. Kim; Robert Mercaş; William Severa; Sean Simmons
Erdős raised the question whether there exist infinite abelian square-free words over a given alphabet (words in which no two adjacent subwords are permutations of each other). Infinite abelian square-free words have been constructed over alphabets of sizes as small as four. In this paper, we investigate the problem of avoiding abelian squares in partial words (sequences that may contain some holes). In particular, we give lower and upper bounds for the number of letters needed to construct infinite abelian square-free partial words with finitely or infinitely many holes. In the case of one hole, we prove that the minimal alphabet size is four, while in the case of more than one hole, we prove that it is five.
Theoretical Informatics and Applications | 2013
Francine Blanchet-Sadri; Sean Simmons; Amelia Tebbe; Amy Veprauskas
Recently, Constantinescu and Ilie proved a variant of the well-known periodicity theorem of Fine and Wilf in the case of two relatively prime abelian periods and conjectured a result for the case of two non-relatively prime abelian periods. In this paper, we answer some open problems they suggested. We show that their conjecture is false, but we give bounds that depend on the two abelian periods such that the conjecture is true for all words having length at least those bounds, and we show that some of them are optimal. We also extend their study to the context of partial words, giving optimal lengths and describing an algorithm for constructing optimal words.
Developments in Language Theory | 2011
Francine Blanchet-Sadri; Sean Simmons
We study abelian repetitions in partial words, or sequences that may contain some unknown positions or holes. First, we look at the avoidance of abelian pth powers in infinite partial words, where p > 2, extending recent results regarding the case where p = 2. We investigate, for a given p, the smallest alphabet size needed to construct an infinite partial word with finitely or infinitely many holes that avoids abelian pth powers. We construct in particular an infinite binary partial word with infinitely many holes that avoids 6th powers. Then we show, in a number of cases, that the number of abelian p-free partial words of length n with h holes over a given alphabet grows exponentially as n increases. Finally, we prove that we cannot avoid abelian pth powers under arbitrary insertion of holes in an infinite word.
Theoretical Informatics and Applications | 2014
Francine Blanchet-Sadri; Benjamin De Winkle; Sean Simmons
Pattern avoidance is an important topic in combinatorics on words which dates back to the beginning of the twentieth century when Thue constructed an infinite word over a ternary alphabet that avoids squares, i.e., a word with no two adjacent identical factors. This result finds applications in various algebraic contexts where more general patterns than squares are considered. On the other hand, Erdős raised the question as to whether there exists an infinite word that avoids abelian squares, i.e., a word with no two adjacent factors being permutations of one another. Although this question was answered affirmatively years later, knowledge of abelian pattern avoidance is rather limited. Recently, (abelian) pattern avoidance was initiated in the more general framework of partial words, which allow for undefined positions called holes. In this paper, we show that any pattern p with n > 3 distinct variables of length at least 2^n is abelian avoidable by a partial word with infinitely many holes, the bound on the length of p being tight. We complete the classification of all the binary and ternary patterns with respect to non-trivial abelian avoidability, in which no variable can be substituted by only one hole. We also investigate the abelian avoidability indices of the binary and ternary patterns.
Acta Informatica | 2012
Francine Blanchet-Sadri; Robert Mercaş; Sean Simmons; Eric Weissenstein
In the paper “F. Blanchet-Sadri, R. Mercas, S. Simmons and E. Weissenstein, Avoidable binary patterns in partial words, Acta Informatica 48(1) (2011) 25–41”, the first sentence of Section 5 as well as Theorem 6 should be deleted. The error is that overlap-freeness does not necessarily coincide with the pattern αβαβα being avoidable. It coincides only in the case when the considered overlaps have length greater than three. Thus, an infinite word might contain factors of the form aaa while still avoiding the pattern αβαβα, where a is a letter of the alphabet the word is defined on. Since the pattern αβαβα is 3-avoidable and is obviously not 1-avoidable, the question of whether αβαβα is 2-avoidable in partial words remains open. Consequently, the pattern αβαβα should be removed from Statement 2 of Theorem 10 and a new statement should be added stating that: