Chess databases are a great tool for learning chess. You can review and play through historic games. You can also analyze large groups of games for common openings, endgames, etc.

There are a variety of chess databases out there, some free and some for sale. For example, you can buy the Chessbase Big Database, which has over 8 million games. Or you can buy the Chessbase Mega Database, which has the same number of games, but 85,000 of them are annotated. There are also websites, such as chessgames.com, where you can search through chess databases online.

I wanted my own copy of a large chess database that I could use for opening reports, searching through historic games, and so on.

For my purposes, I did not want to spend more money on a database at this time. I may change my mind in the future, in which case I’ll buy one of the Chessbase databases, but for now I was just looking for a large free collection of games.

Original Post and Caissabase

In 2021 I wrote about how to get a free large chess database using Caissabase. That continues to be one of my most popular posts. However, it looks like Caissabase has gone away, so I needed to find a new large database.

As of the time of writing, you can still find a 2024 version of Caissabase among the En Croissant databases.

Lumbra’s Gigabase

The best and largest free database I could find was Lumbra’s Gigabase. This is a great resource. The games are not annotated like the ones in the Mega Database, but there is a huge number of curated games. There are no games with fewer than 5 moves, the goal is for every game to be at least master strength, and duplicates have been removed (as far as that is possible with millions of games).

Please support the creators of these databases to help keep them free and available for everyone! Go buy the owner a coffee.

As of the latest version I have (updated in June 2025), here are some statistics:

  • 9.6 million games: I actually downloaded about 10 million games, but around 400k of them were duplicates, which I found using Chessbase. I know the original database also tries to remove duplicates, but I’m sure different methods catch different kinds of duplicates.

  • The latest games are from June 2025 (the day before I downloaded it)

  • The earliest game was played in Rome in 1610: Giulio Cesare Polerio vs Domenico.
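
Numbers like these can also be checked outside of Chessbase. Below is a minimal Python sketch, assuming the PGN files use the standard `[Event "..."]` and `[Date "YYYY.MM.DD"]` header tags. The function name and the file name in the usage comment are made up for illustration, and the count is for the raw download, before any duplicates are removed.

```python
import re
from typing import Iterable, Optional, Tuple

# Matches a Date tag with a known year, e.g. [Date "1610.??.??"].
# Unknown years such as [Date "????.??.??"] are deliberately not matched.
DATE_TAG = re.compile(r'^\[Date "(\d{4})\.')


def pgn_stats(lines: Iterable[str]) -> Tuple[int, Optional[int], Optional[int]]:
    """Return (number of games, earliest year, latest year) for PGN text."""
    games = 0
    first_year: Optional[int] = None
    last_year: Optional[int] = None
    for line in lines:
        # Assumption: every game starts with exactly one Event tag.
        if line.startswith('[Event "'):
            games += 1
            continue
        match = DATE_TAG.match(line)
        if match:
            year = int(match.group(1))
            if first_year is None or year < first_year:
                first_year = year
            if last_year is None or year > last_year:
                last_year = year
    return games, first_year, last_year


# Usage (hypothetical file name), reading line by line so a multi-GB file
# never has to fit in memory:
#
#     with open("otb-0-1899.pgn", encoding="utf-8", errors="replace") as f:
#         print(pgn_stats(f))
```

Only the year is compared here because many old games have unknown months and days.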

Downloading - SCID

If you are using SCID, download the SCID version of the database that matches the version of SCID you have.

I don’t really use SCID, so I didn’t play with this too much.

Downloading - PGN and Chessbase

If you are going to use Chessbase, the process is a bit more complex. Option 1 is to download the SCID files, open them in SCID, and then export a PGN. That is probably not a great approach unless you use both SCID and Chessbase.

Option 2 is what I did: download the PGN files directly. There are currently 11 different PGN files, split by when the games were played, from “0-1899” all the way to “2025”. I only downloaded the OTB games, and that came to 7.2 GB of PGNs. The good thing is that once you have done this, you won’t have to download all of them again. You can just download the latest updates or the current year.

Importing into Chessbase

Here is what I did to get these into Chessbase. If you know of a better way, please let me know!

  • Create a new database

  • Open the database

  • Choose Paste > Append Games - repeat this for each file you downloaded

This will take some time, but you are importing millions of games. It took me about an hour to get them all loaded, but that could be my slow computer. Each file took 5-10 minutes depending on the number of games.
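
One possible shortcut, which I have not tested against Chessbase itself, is to merge the downloaded PGN files into a single file first so the append step only has to be done once. PGN is plain text, so the files can simply be concatenated. Here is a minimal Python sketch; the function name and the folder name in the usage comment are made up for illustration.

```python
from pathlib import Path
from typing import Iterable


def merge_pgn_files(sources: Iterable[Path], destination: Path) -> int:
    """Concatenate PGN files into one file and return the bytes written."""
    written = 0
    with destination.open("wb") as out:
        for source in sources:
            last_chunk = b""
            with source.open("rb") as src:
                # Copy in 1 MiB chunks so large files never sit in memory.
                while chunk := src.read(1 << 20):
                    out.write(chunk)
                    written += len(chunk)
                    last_chunk = chunk
            # Keep a blank line between the last game of one file and the
            # first game of the next.
            if last_chunk and not last_chunk.endswith(b"\n\n"):
                out.write(b"\n\n")
                written += 2
    return written


# Usage (hypothetical folder and file names). The output is written to a
# different folder so it is not picked up as one of its own inputs:
#
#     files = sorted(Path("gigabase").glob("*.pgn"))
#     merge_pgn_files(files, Path("merged/gigabase-all.pgn"))
```

Whether one very large PGN imports faster than eleven smaller ones is something I can't say, so treat this as an idea to try rather than a recommendation.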

There are other ideas you could try, like using the terminal and the pgn2cbh utility, but since I was only doing this once I didn’t bother.

Database Cleanup

There are a few things that I did once I had them all imported:

  • Maintenance > Find Double Games (Kill Doubles): This took 5-10 minutes and found 400k repeated games that were marked for deletion.

  • Maintenance > Remove Deleted Games (Pack Database): This took about 20 minutes. I was interested in removing the double games I just deleted, but this (I think) does a bit more than that. It removes deleted games but also compacts the database files. Basically a “defrag” command.

  • (Optional) Maintenance > Improve: This took hours and I just let it run overnight so I’m not sure exactly how long. This fixes the tournament spelling and data. I’m not sure if I really needed to do this.

Now I can use this reference database for studying openings or reviewing historic games. I hope this database will continue to be updated so I can keep getting new games.