How strict is linear regression on the assumption of normality?
Even after transforming my data with the optimal normalization method (identified using the bestNormalize package in R), my data is still not normal or near normal. Can I still run a linear regression on this non-normally distributed data?
Many datasets will never have normal residuals because the data is inherently not normally distributed. You should not model such data with a linear regression; instead, use a generalised linear model from a family appropriate for the data. To help you choose an appropriate model, we would need to know what the data is.
The assumption of normality relates to the residuals and not the data itself. Personally, I would not rely on packages like bestNormalize to tell me how the data looks. Also remember that only fabricated data will be perfectly normal. You could start by plotting a histogram of your data and reporting summary statistics such as min, max, covariance, standard deviation, mean, median, and interquartile range.
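To illustrate the point that normality concerns the residuals rather than the raw data, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption made up for the illustration: the simulated data (an exponentially distributed predictor with normal noise), the coefficients 2 and 3, and the `skewness` helper. It is not your data, and it is not a formal normality test.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is reproducible

# Hypothetical data: a strongly right-skewed predictor plus normal noise.
n = 2000
x = [random.expovariate(1.0) for _ in range(n)]
y = [2.0 + 3.0 * xi + random.gauss(0.0, 1.0) for xi in x]


def skewness(values):
    """Sample skewness: near 0 for symmetric data, positive for a long right tail."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


# Ordinary least squares with a single predictor, computed by hand.
mean_x = statistics.fmean(x)
mean_y = statistics.fmean(y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residuals: what is left of y after removing the fitted line.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

print(f"skewness of y:         {skewness(y):.2f}")
print(f"skewness of residuals: {skewness(residuals):.2f}")
```

In this simulation the response is clearly skewed while the residuals are close to symmetric, so a linear regression is reasonable even though the raw data would fail a normality check. In R, the analogous step is to fit the model with `lm()` and inspect `residuals(fit)`, for example with `qqnorm()`, instead of testing the raw variables.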