We have published the corpus used for the analysis in the paper "GitTables: A Large-Scale Corpus of Relational Tables". It consists of 1.7M tables from GitTables, and we refer to it as the GitTables 1.7M corpus.

The corpus is roughly 25.5 GB and is hosted on Zenodo under DOI 10.5281/zenodo.4943312 (https://doi.org/10.5281/zenodo.4943312).
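
Given the download size, it may be useful to inspect the record's file list before fetching anything. The sketch below is one possible way to do that from Python using only the standard library. It rests on several assumptions that are not stated in this document: that the numeric suffix of the DOI is also the Zenodo record ID, that the record is served from the public `https://zenodo.org/api/records/<id>` endpoint, and that the returned JSON has a `files` list whose entries carry `key` and `size` fields. The helper names are hypothetical.

```python
import json
import urllib.request

DOI = "10.5281/zenodo.4943312"  # DOI of the corpus, as given above


def doi_url(doi: str) -> str:
    """Return the doi.org resolver URL for a DOI."""
    return f"https://doi.org/{doi}"


def zenodo_record_id(doi: str) -> str:
    """Extract the numeric part of a Zenodo DOI.

    Assumption: the DOI has the form '10.5281/zenodo.<id>' and <id> is
    also the record ID used by the Zenodo API.
    """
    prefix, _, suffix = doi.partition("/")
    if prefix != "10.5281" or not suffix.startswith("zenodo."):
        raise ValueError(f"not a Zenodo DOI: {doi}")
    return suffix[len("zenodo."):]


def list_files(doi: str) -> list[tuple[str, int]]:
    """Return (file name, size in bytes) pairs for the record.

    Needs network access. The endpoint and the JSON layout
    ('files' entries with 'key' and 'size') are assumptions.
    """
    url = f"https://zenodo.org/api/records/{zenodo_record_id(doi)}"
    with urllib.request.urlopen(url) as resp:
        record = json.load(resp)
    return [(f["key"], f["size"]) for f in record.get("files", [])]
```

Calling `list_files(DOI)` should then give the archive names and sizes, so that individual files can be downloaded selectively rather than all 25.5 GB at once.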