I had a PGN file containing multiple variations, and I wanted to split each variation into a separate PGN file that I could then load into Lichess or ChessBase. Here is what I did.

Thanks to this Reddit post for setting me on the right path.

PGN Extract

I started by writing a script in Python using the python-chess PGN library. However, traversing the variations turned out to be a little complicated.
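To give a sense of why the traversal takes some care, here is a minimal sketch of the underlying idea: a game with variations is a tree of moves, and each "split" game is one path from the first move to a leaf. The Node class and split_lines function are hypothetical stand-ins for illustration only. They are not the python-chess API and not how PGN Extract works internally.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Node:
    """Hypothetical move-tree node: one move plus its continuations."""
    move: str
    # By convention here, children[0] is the main line and the rest are variations.
    children: List["Node"] = field(default_factory=list)


def split_lines(node: Node, prefix: Optional[List[str]] = None) -> Iterator[List[str]]:
    """Yield every root-to-leaf move sequence below (and including) node."""
    line = (prefix or []) + [node.move]
    if not node.children:
        # A leaf ends one complete line, i.e. one "split" game.
        yield line
        return
    for child in node.children:
        yield from split_lines(child, line)


# 1. e4 e5 (1... c5 2. Nf3) 2. Nf3
tree = Node("e4", [
    Node("e5", [Node("Nf3")]),
    Node("c5", [Node("Nf3")]),
])

lines = list(split_lines(tree))
for moves in lines:
    print(" ".join(moves))
```

A real script would also have to carry over tags, comments, and move numbers for each line, which is where most of the complication comes from.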

Then I found a tool called PGN Extract that did exactly what I wanted. To install it on macOS or Linux, I followed these steps:

  1. Go to the [webpage] and download either the .zip or .tgz archive of the latest version
  2. Extract the archive into a folder
  3. Run make in that folder

If all goes well, you will have an executable called pgn-extract.

Splitting PGNs

Once I had the executable, I was ready to go.

./pgn-extract MyMainPGN.pgn --splitvariants --output Split.pgn
./pgn-extract Split.pgn -#1

The first command creates a single Split.pgn file in which each variation appears as a separate game. This might be enough.

The second command writes each game to its own file. The files are numbered 1.pgn, 2.pgn, etc.
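For anyone curious what the one-game-per-file step amounts to, here is a rough Python approximation. It is a sketch under a simplifying assumption: every game begins with an [Event tag at the start of a line. The function names split_games and write_numbered are my own, and PGN Extract itself parses PGN far more carefully than this.

```python
import tempfile
from pathlib import Path
from typing import List


def split_games(pgn_text: str) -> List[str]:
    """Split PGN text into games, assuming each game starts with an [Event tag."""
    games: List[str] = []
    current: List[str] = []

    def flush() -> None:
        # Only keep chunks that contain something other than blank lines.
        if any(line.strip() for line in current):
            games.append("\n".join(current).strip() + "\n")
        current.clear()

    for line in pgn_text.splitlines():
        if line.startswith("[Event "):
            flush()
        current.append(line)
    flush()
    return games


def write_numbered(games: List[str], out_dir: str) -> List[Path]:
    """Write each game to 1.pgn, 2.pgn, ... inside out_dir."""
    paths = []
    for number, game in enumerate(games, start=1):
        path = Path(out_dir) / f"{number}.pgn"
        path.write_text(game)
        paths.append(path)
    return paths


sample = '''[Event "A"]

1. e4 e5 *

[Event "B"]

1. d4 d5 *
'''

games = split_games(sample)
out_dir = tempfile.mkdtemp()
paths = write_numbered(games, out_dir)
print([p.name for p in paths])
```

In practice the pgn-extract command is the better choice, since it validates the games as it goes.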

PGN Extract Help

Since I couldn’t find it anywhere else on the web, here is the --help output from the version I built:

pgn-extract v21-02 (Nov 16 2021): a Portable Game Notation (PGN) manipulator.
Copyright (C) 1994-2021 David J. Barnes (d.j.barnes@kent.ac.uk)
https://www.cs.kent.ac.uk/people/staff/djb/pgn-extract/

Usage: pgn-extract [arguments] [file.pgn ...]
-7 -- output only the seven tag roster for each game. Other tags (apart
      from FEN and possibly ECO) are discarded (See -e).
-#num[,num] -- output num games per file, to files named 1.pgn, 2.pgn, etc.

-aoutputfile -- append extracted games to outputfile. (See -o).
-Aargsfile -- read the program's arguments from argsfile.
-b[elu]num -- restricted bounds on the number of moves in a game.
       lnum set a lower bound of 'num' moves,
       unum set an upper bound of 'num' moves,
       otherwise num (or enum) means equal-to 'num' moves.
-cfile[.pgn] -- Use file.pgn as a check-file for duplicates or
      contents of file (no pgn suffix) as a list of check-file names.
-C -- don't include comments in the output. Ordinarily these are retained.
-dduplicates -- write duplicate games to the file duplicates.
-D -- don't output duplicate games.
-eECO_file -- perform ECO classification of games. The optional
      ECO_file should contain a PGN format list of ECO lines
      Default is to use eco.pgn from the current directory.
-E[123 etc.] -- split output into separate files according to ECO.
      E1 : Produce files from ECO letter, A.pgn, B.pgn, ...
      E2 : Produce files from ECO letter and first digit, A0.pgn, ...
      E3 : Produce files from full ECO code, A00.pgn, A01.pgn, ...
      Further digits may be used to produce non-standard further
      refined division of games.
      All files are opened in append mode.
-F[text] -- output a FEN string comment after the final (or other selected) move.
-ffile_list  -- file_list contains the list of PGN source files, one per line.
-Hhash -- match games in which the given Zobrist polyglot hash value occurs
-h -- print details of the arguments.
-llogfile  -- Save the diagnostics in logfile rather than using stderr.
-Llogfile  -- Append all diagnostics to logfile, rather than overwriting.
-M -- Match only games which end in checkmate.
-noutputfile -- Write all valid games not otherwise output to outputfile.
-N -- don't include NAGs in the output. Ordinarily these are retained.
-ooutputfile -- write extracted games to outputfile (existing contents lost).
-p[elu]num -- restricted bounds on the number of ply in a game.
       lnum set a lower bound of 'num' ply,
       unum set an upper bound of 'num' ply,
       otherwise num (or enum) means equal-to 'num' ply.
-P -- don't match permutations of the textual variations (-v).
-Rtagorder -- Use the tag ordering specified in the file tagorder.
-r -- report any errors but don't extract.
-S -- Use a simple soundex algorithm for some tag matches. If used
      this option must precede the -t or -T options.
-s -- silent mode: don't report each game as it is extracted.
-ttagfile -- file of player, date, result or FEN extraction criteria.
-Tcriterion -- player, date, eco code, hashcode, FEN position, annotator or result, extraction criteria.
-U -- don't output games that only occur once. (See -d).
-vvariations -- the file variations contains the textual lines of interest.
-V -- don't include variations in the output. Ordinarily these are retained.
-wwidth -- set width as an approximate line width for output.
-W[cm|epd|halg|lalg|elalg|xlalg|xolalg|san] -- specify the output format to use.
      Default is SAN.
      -W means use the input format.
      -Wcm is (a possibly obsolete) ChessMaster format.
      -Wepd is EPD format.
      -Wsan[PNBRQK] for language specific output.
      -Whalg is hyphenated long algebraic.
      -Wlalg is long algebraic.
      -Welalg is enhanced long algebraic.
      -Wxlalg is enhanced long algebraic with x for captures and - for non capture moves.
      -Wxolalg is -Wxlalg but with O-O and O-O-O for castling.
      -Wuci is output compatible with the UCI protocol.
-xvariations -- the file variations contains the lines resulting in
                positions of interest.
-yfile -- file contains a material balance of interest.
-zfile -- file contains a material balance of interest.
-Z -- use the file virtual.tmp as an external hash table for duplicates.
      Use when MallocOrDie messages occur with big datasets.

--addhashcode - output a HashCode tag
--addlabeltag - output a MatchLabel tag with FENPattern
--addmatchtag - output a MaterialMatch tag with -z
--allownullmoves - allow NULL moves in the main line
--append - see -a
--btm - match position only if Black is to move (see -t)
--checkfile - see -c
--checkmate - see -M
--commentlines - output each comment on a separate line
--dropbefore - drop opening ply before a matching comment string
--dropply - drop the given number of ply from the beginning of the game
--duplicates - see -d
--evaluation - include a position evaluation after each move
--fencomments - include a FEN string after each move
--fenpattern pattern - match games reaching a position matching the given FEN pattern
--fenpatterni pattern - match games reaching a position matching the given FEN pattern for either side
--fifty - only output games that include fifty moves with no capture or pawn move.
--fixresulttags - correct Result tags that conflict with the game outcome or terminating result.
--fixtagstrings - attempt to correct tag strings that are not properly terminated.
--fuzzydepth plies - positional duplicates match
--hashcomments - include a hashcode string after each move
--help - see -h
--json - output the game in JSON format
--keepbroken - retain games with errors
--linelength - see -w
--linenumbers marker - include a comment with the source line numbers of each game { marker:start:end }
--matchplylimit - maximum ply depth to search for positional matches
--markmatches - mark positional and material matches with a comment; see -t, -v, and -z
--materialy material - material is a string describing a material balance; see -y
--materialz material - material is a string describing a material balance; see -z
--nestedcomments - allow nested comments
--nobadresults - reject games with inconsistent result indications.
--nochecks - don't output + and # after moves.
--nocomments - see -C
--noduplicates - see -D
--nofauxep - don't output ep squares in FEN when the capture is not possible
--nomovenumbers - don't output move numbers.
--nonags - see -N
--noresults - don't output results.
--nosetuptags - don't match games with a SetUp tag.
--notags - don't output any tags.
--nounique - see -U
--novars - see -V
--onlysetuptags - only match games with a SetUp tag.
--output - see -o
--plycount - include a PlyCount tag.
--plylimit - limit the number of plies output.
--quiescent N - position quiescence length (default 0)
--quiet - No status processing output (see, also, -s).
--repetition - only output games that include 3-fold repetition.
--selectonly range[,range ...] - only output the selected matched game(s)
--seven - see -7
--skipmatching range[,range ...] - don't output the selected matched game(s)
--splitvariants [depth] - output each variation (to the given depth) as a separate game.
--stalemate - only output games that end in stalemate.
--startply N - only start matching after N ply (N >= 1).
--stopafter N - stop after matching N games (N > 0)
--tagsubstr - match in any part of a tag (see -T and -t).
--totalplycount - include a tag with the total number of plies in a game.
--underpromotion - match only games that contain an underpromotion.
--version - print the current version number and exit.
--wtm - match position only if White is to move (see -t)
--xroster - don't output tags not included with the -R option (see -R).