I don't see how it's misleading. MiniGPT-4 makes it sound like a smaller alternative to GPT-4; if it were based on GPT-4, there would be nothing 'mini' about it.
Size-wise it has more in common with GPT-3 than GPT-4, but in reality it's based on Vicuna/Llama, which is roughly 10x smaller than either, so as far as the LLM part goes it's not mini-anything; it's just straight-up Vicuna 13B.
The model as a whole is just BLIP-2 with a larger linear projection layer and Vicuna swapped in as the LLM. If you look at their code, it's literally using the entire BLIP-2 vision encoder (Salesforce's code).
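To make the "BLIP-2 plus a linear layer" point concrete, here's a rough PyTorch sketch of the forward path. The module names and structure are illustrative, not the repo's actual identifiers; I'm assuming the usual dims (Q-Former hidden size 768, Vicuna-13B hidden size 5120):

```python
import torch
import torch.nn as nn

class MiniGPT4Sketch(nn.Module):
    """Hypothetical sketch: frozen BLIP-2 pieces + one trainable linear layer + frozen Vicuna."""

    def __init__(self, vit, qformer, vicuna, qformer_dim=768, llm_dim=5120):
        super().__init__()
        self.vit = vit          # frozen ViT image encoder, straight from BLIP-2
        self.qformer = qformer  # frozen Q-Former, also from BLIP-2
        self.vicuna = vicuna    # frozen Vicuna-13B LLM
        # The only trainable piece: project Q-Former tokens into the LLM's embedding space
        self.proj = nn.Linear(qformer_dim, llm_dim)

    def forward(self, image, text_embeds):
        with torch.no_grad():
            img_feats = self.vit(image)          # patch features from the image
            query_out = self.qformer(img_feats)  # e.g. 32 query tokens x 768
        img_tokens = self.proj(query_out)        # 32 tokens x 5120
        # Prepend the projected image tokens to the text embeddings and let Vicuna do the rest
        inputs = torch.cat([img_tokens, text_embeds], dim=1)
        return self.vicuna(inputs_embeds=inputs)
```

Everything except that one `nn.Linear` is frozen pretrained weights, which is why the whole thing trains so cheaply; all the "new" model really learns is how to map BLIP-2's visual tokens into Vicuna's input space.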