I don't know much about Liquibase, but I believe it doesn't support accessing both the old and the new schema versions at the same time? (I could be wrong here.)
Backfilling happens in batches: we walk the table by its PK to update all existing rows, and a trigger is also installed so any new insert/update runs the backfill logic and populates the new column. A minimal sketch of the batched loop is below.
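Roughly something like this (a sketch only, assuming PostgreSQL with psycopg2 and a hypothetical `users` table whose new `full_name` column is derived from existing columns; table, column, and connection details are made up):

```python
import psycopg2  # assumes PostgreSQL; any driver with the same pattern works

BATCH_SIZE = 1000

conn = psycopg2.connect("dbname=app")  # hypothetical connection string
conn.autocommit = True  # commit each batch so row locks stay short-lived

with conn.cursor() as cur:
    last_id = 0
    while True:
        # Backfill the new column for the next batch of rows, keyed by the PK,
        # and return the ids touched so we know where to resume.
        cur.execute(
            """
            UPDATE users
               SET full_name = first_name || ' ' || last_name
             WHERE id IN (
                   SELECT id FROM users
                    WHERE id > %s AND full_name IS NULL
                    ORDER BY id
                    LIMIT %s)
            RETURNING id
            """,
            (last_id, BATCH_SIZE),
        )
        ids = [row[0] for row in cur.fetchall()]
        if not ids:
            break  # no rows left to backfill
        last_id = max(ids)
```

The trigger is installed before the batch job starts, so rows written during the run are already consistent and the loop only has to catch up on the old data.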
The tool compiles the data pipeline into Apache Spark, which works out the best way to execute it and then runs it. No matter how you build your pipeline, Spark will find an efficient plan to compute it, giving you better performance and efficiency at the same time.
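For example, in PySpark the transformations are declared lazily and the Catalyst optimizer rewrites the logical plan before anything executes. A small sketch (the `orders.csv` file and its columns are made up for illustration):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("optimizer-demo").getOrCreate()

# Hypothetical input: a CSV of orders with columns order_id, country, amount.
orders = spark.read.csv("orders.csv", header=True, inferSchema=True)

# Transformations are lazy: Spark records them as a logical plan
# instead of running them immediately.
result = (
    orders
    .filter(F.col("amount") > 0)
    .groupBy("country")
    .agg(F.sum("amount").alias("total"))
    .filter(F.col("country") == "US")
)

# Catalyst rewrites the plan before execution, e.g. pushing the country
# filter down below the aggregation so far less data is shuffled.
result.explain()   # print the optimized physical plan
result.show()      # only now does Spark actually execute the pipeline
```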
Also, how do you handle the backfilling of columns? How do you make sure you don't miss any data before dropping the old column?