overall performance is anticipated to get similar or much better than other architectures properly trained on identical data, although not to match larger sized or wonderful-tuned models.
Mamba, like Flash https://k2spiceshop.com/product/liquid-k2-on-paper-online/