When I compiled the book for @AtomicChessBot, I processed all monthly atomic databases. This took about 5 hours using
github.com/niklasf/python-chess. Since python-chess is written in a high-level language, its game parser is not high-performance (though it is high-quality).
The trick you can apply is to first filter the PGNs based only on their headers. This can be done very fast, even in a high-level language. In my case I only needed games where both players were rated above 2200. This let me discard tons of meaningless games without running them through the python-chess parser. If I had used python-chess for every game, the job would have taken ages to finish.
So try to be smart about your parser and get rid of unwanted games based on PGN headers first.
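To make the idea concrete, here is a minimal sketch of such a header-only pre-filter in plain Python. It is not the code from my tool, just an illustration: the function names (split_games, rating, filter_games) are made up for this example, and it assumes each game has some movetext (at least a result token) and carries the usual WhiteElo and BlackElo tags.

```python
import re
from typing import Dict, Iterable, Iterator, List, Tuple

# Matches a PGN tag pair line such as: [WhiteElo "2310"]
HEADER_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')


def split_games(lines: Iterable[str]) -> Iterator[Tuple[Dict[str, str], str]]:
    """Yield (headers, raw_text) for each game without parsing any moves.

    Assumption: every game has at least one movetext line, so a header
    line that follows movetext marks the start of the next game.
    """
    headers: Dict[str, str] = {}
    raw: List[str] = []
    in_movetext = False
    for line in lines:
        match = HEADER_RE.match(line)
        if match and in_movetext:
            # A header after movetext: the previous game is complete.
            yield headers, "".join(raw)
            headers, raw, in_movetext = {}, [], False
        if match:
            headers[match.group(1)] = match.group(2)
        elif line.strip():
            in_movetext = True
        raw.append(line)
    if headers:
        yield headers, "".join(raw)


def rating(headers: Dict[str, str], key: str) -> int:
    """Return the rating stored under `key`, or 0 if missing or not a number."""
    try:
        return int(headers.get(key, ""))
    except ValueError:
        return 0  # e.g. "?" for unrated or anonymous players


def filter_games(lines: Iterable[str], min_elo: int = 2200) -> Iterator[str]:
    """Yield the raw PGN text of games where both players are rated above min_elo."""
    for headers, raw_text in split_games(lines):
        if rating(headers, "WhiteElo") > min_elo and rating(headers, "BlackElo") > min_elo:
            yield raw_text
```

You can feed it an open file object directly, since iterating over a file yields lines. Only the games that survive then need to go through the real parser, for example with chess.pgn.read_game(io.StringIO(raw_text)).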
This is the tool I used to build the atomic book:
github.com/lichapibot/cbuild