N&O Seminar Pengfei Wang (CNRS - LIRMM Montpellier)

Combinatorics of period sets

When
15 Feb 2023 from 11 a.m. to 15 Feb 2023 noon CET (GMT+0100)
Where
L016 and online
Web
Add

Speaker: Pengfei Wang (CNRS-LIRMM Montpellier)
Title: Combinatorics of period sets

Abstract:

When a word \(w\) can be written \(uvu\) for some words \(u,v\), then \(u\) is called a \emph{border} of \(w\) and the length of \(uv\) is a \emph{period} of \(w\). A word can have several borders/periods, in which case smaller borders are nested into longer ones. For example, consider the word \(abracadabra\): then, \(abra\) is

its longest border and corresponds to period 7, but \(a\) also is a border, corresponding to period 10, and finally the empty word is a trivial border corresponding to a period that is the length of \(abracadabra\). Importantly, a period is an offset at which two occurrences of a word \(w\) can overlap themselves.

The notions of borders and periods are key in word combinatorics, in stringology, and especially in pattern matching algorithms. The set of periods of a word impacts how occurrences of this word can appear in

random texts. Consider words of length \(n\). The set of all periods a word of length \(n\) is a subset of \(\{1, 2, \ldots, n\}\). However, any subset of \(\{1, 2, \ldots, n\}\) is not necessarily a valid set of periods. In a seminal paper in 1981, Guibas and Odlyzko have proposed to encode the set of periods of a word into a \(n\) long binary string, called autocorrelation, where a one denotes at position \(i\) a period. They considered the question of recognizing a valid period set, and also studied the number of valid period sets for length \(n\), denoted \(\kappa_n\). They conjectured that \(\ln \kappa_n\) asymptotically converges to a constant times of \(\ln^2 n\). If improved lower bounds for \(\ln( \kappa_n )/{\ln^2(n)}\) were proposed in 2001 the question of a tight upper bound has remained opened since Guibas and Odlyzko's paper.

In this talk, I will present our results----an upper bound for this fraction, which implies its convergence, and closes this long standing conjecture. Moreover, we extend our result to find similar bounds for the number of correlations: a generalization of autocorrelations which encodes the overlaps between two strings.

Joint work with Eric Rivals and Michelle Sweering.

The talk will be given in L016 at CWI. For remote viewers, the talk will be streamed via the link https://cwi-nl.zoom.us/j/87542191297?pwd=QTIxYlNQUmQ5UTFKb0NSaVVsbGFJZz09