This page contains a dataset which was compiled with permission from public genealogical profiles uploaded to Geni.com, a MyHeritage company.
Thanks to the commitment of MyHeritage to support scientific research, we were able to release redacted data that contains the basic family tree structures and demographic information. The data does not contain names or other explicit identifiers of Geni.com users.
See the Data page for citation information.
We collect your contact information only to document the distribution of the data. We will not use this information to contact you with updates or offers, and we will not give it to any third party.
Please enter a valid email address: an email with download instructions will be sent to the address you provide.
Once your access is approved, you'll receive an email with a link to the dataset archive.
Download this file and extract it on your computer.
Warning: the datasets require a large amount of storage (2GB compressed, ~16GB uncompressed).
The archives are compressed using the XZ format. The required programs are commonly available on Unix-like systems (e.g. macOS and Linux). To extract the archives on Windows, download the 7-Zip program.
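Before downloading, it may help to confirm that the extraction tools are installed and that the disk has room. The following is a minimal sketch, not part of the official instructions: the helper names `have_tool` and `free_gb` are hypothetical, and the 18 GB threshold is an assumption obtained by adding the compressed and uncompressed sizes quoted in the warning above.

```shell
# Return success if the named program is on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Print the free space, in whole GB, on the filesystem holding a directory.
free_gb() {
    # 'df -Pk' prints POSIX-format output in 1024-byte blocks;
    # on the second line, field 4 is the available space.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# 'tar' and 'xz' are the tools used to unpack the archive.
for tool in tar xz; do
    have_tool "$tool" || echo "missing: $tool" >&2
done

# Assumed threshold: 2 GB archive + ~16 GB extracted = 18 GB.
if [ "$(free_gb .)" -lt 18 ]; then
    echo "less than 18 GB free in $(pwd)" >&2
fi
```

Both checks only print a message to standard error; neither stops the shell, so the commands can be pasted into an interactive session.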
# Download the archive (based on URL given in approval email)
# Be sure to use single-quote characters around the URL
$ wget -O familinx.tar.xz 'https://server.com/familinx.tar.xz?AWSKey=XXXXX'
# If the 'wget' program is not available, use 'curl' instead:
$ curl 'https://server.com/familinx.tar.xz?AWSKey=XXXXX' > familinx.tar.xz
# Optional: verify data integrity.
# You should see the exact same value as printed below:
$ sha256sum familinx.tar.xz
a336cd271168ed5dc90efc84d25a12256e767f591a63634d174edfccdc4f1c6a
# Extract the data files
$ tar -xf familinx.tar.xz
# The files are in a 'familinx' subdirectory
$ cd familinx
$ ls -lh
-rw-r--r-- 1 user users  15G Feb 8 18:10 profiles-anon.txt
-rw-r--r-- 1 user users 2.1K Feb 7 16:03 profiles-field-list.txt
-rw-r--r-- 1 user users 1004 Feb 8 18:23 README
-rw-r--r-- 1 user users 877M Feb 8 12:14 relations-anon.txt
-rw-r--r-- 1 user users  259 Feb 8 18:37 sha256sum-anon.txt
# The 'relations-anon.txt' file contains parent/child relations
$ head relations-anon.txt
parent child
1002 2044
1002 2045
1004 2045
1006 2046
...
# The 'profiles-anon.txt' file contains information about each profile
# See 'profiles-field-list.txt' for the complete field list
$ cut -f1,2,3,15,16,17,20,21,22 profiles-anon.txt | head
id    gender  is_alive  birth_year  birth_city    birth_state  birth_country
1002  male    0         1917        Cleveland     Ohio         United States
1005  *       0         1908        *             *            *
1009  male    0         1942        Philadelphia  PA           US
1010  female  0         1904        Guxhagen      Hesse        DE
1011  male    0         1942        *             *            IN
...
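As a first query on the parent/child relations, one might count how many child links each parent has. The sketch below is illustrative only: the function name `children_per_parent` is hypothetical, and it assumes the file has a one-line header followed by whitespace-separated `parent child` pairs, as in the `head` output of 'relations-anon.txt' shown earlier. To stay runnable without the full dataset, it builds a tiny sample from those same rows.

```shell
# Count how many child links each parent id has.
# Assumes a one-line header and whitespace-separated columns.
children_per_parent() {
    awk 'NR > 1 { n[$1]++ } END { for (p in n) print p, n[p] }' "$1" | sort -n
}

# Tiny sample built from the rows in the excerpt of relations-anon.txt.
sample=$(mktemp)
printf 'parent\tchild\n1002\t2044\n1002\t2045\n1004\t2045\n1006\t2046\n' > "$sample"

children_per_parent "$sample"
# prints:
# 1002 2
# 1004 1
# 1006 1
```

On the real data the same function could be called as `children_per_parent relations-anon.txt | head`. Because awk keeps one counter per distinct parent in memory, memory use on the 877M file depends on how many distinct parent ids it contains.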