What happens when Gen AI takes on the role of quality-checking the outputs of Gen AI? Are LLMs more likely to score their own text as the best, compared with text from other base models?
Can AI Judge AI Output?