Deep generative models such as normalizing flows have been proposed as alternatives to standard methods for generating lattice gauge field configurations. Previous studies of normalizing flows have demonstrated proof of principle for simple models in two dimensions. However, further studies indicate that the training cost can, in general, be very high for large lattices. This poor scaling suggests that moderate-size networks cannot efficiently handle the inherently multi-scale aspects of the problem, especially near critical points. In this talk, we examine current models that lead to poor acceptance rates for large lattices and explain how effective field theories can guide the design of models whose cost scales better with lattice size. Finally, we discuss alternative ways of handling poor acceptance rates for large lattices.