As a baseline, we compute the weighted F1 score of a model that always predicts the most frequent class in the target column. This baseline depends on how skewed the class distribution of the target column is: it might be 0.2 for one dataset and 0.7 for another. It is therefore essential to compare a new model's F1 score against the baseline for the same target column, because a score of 0.7 may be a clear improvement on the first dataset yet no better than the trivial predictor on the second.
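To make the baseline concrete, the following is a minimal Python sketch of how such a score could be computed. It is not the original implementation: the function names `weighted_f1` and `majority_baseline_f1` and the toy label lists are assumptions made for illustration, and the weighted F1 is computed by hand as the support-weighted mean of per-class F1 scores rather than through any particular library.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (hypothetical helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("y_true must not be empty")

    support = Counter(y_true)      # TP + FN per class
    predicted = Counter(y_pred)    # TP + FP per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    total = 0.0
    for label, n in support.items():
        # F1 = 2*TP / (2*TP + FP + FN) = 2*TP / (support + predicted)
        f1 = 2 * true_pos[label] / (n + predicted[label])
        total += n * f1
    return total / len(y_true)


def majority_baseline_f1(y_true):
    """Weighted F1 of a model that always predicts the most common class."""
    majority_label = Counter(y_true).most_common(1)[0][0]
    return weighted_f1(y_true, [majority_label] * len(y_true))


# Toy target columns (illustrative data, not from the original datasets).
skewed_target = ["a"] * 8 + ["b"] * 2
balanced_target = ["a"] * 4 + ["b"] * 3 + ["c"] * 3

print(f"{majority_baseline_f1(skewed_target):.3f}")    # 0.711
print(f"{majority_baseline_f1(balanced_target):.3f}")  # 0.229
```

In this sketch only the majority class contributes a non-zero F1, so if the majority class makes up a fraction p of the rows, the baseline reduces to 2p²/(1 + p). That is why the same always-predict-the-majority model scores about 0.71 on the skewed toy column and about 0.23 on the more balanced one.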