Many studies use p-values to assess the significance of their claims. P-hacking refers to any method a researcher uses (wittingly or unwittingly) to fallaciously lower a p-value and make their conclusions seem more significant than they are. This can involve removing “problematic” observations, gathering more data until significance is reached, or performing multiple tests without appropriately adjusting the p-values.
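To see how easily multiple testing produces spurious significance, here is a minimal simulation (the group sizes, number of tests, and Bonferroni comparison are illustrative assumptions, not from any particular study). Every tested “effect” is pure noise, yet without a correction at least one test comes out “significant” most of the time.

```python
# Simulate repeated experiments in which 20 null hypotheses are tested at once.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_experiments = 2000   # repeated experiments
n_tests = 20           # e.g. 20 outcome variables, all pure noise
alpha = 0.05

any_hit_uncorrected = 0
any_hit_bonferroni = 0
for _ in range(n_experiments):
    # Both groups are drawn from the same distribution, so every null is true.
    pvals = np.array([
        stats.ttest_ind(rng.normal(size=30), rng.normal(size=30)).pvalue
        for _ in range(n_tests)
    ])
    any_hit_uncorrected += (pvals < alpha).any()
    any_hit_bonferroni += (pvals < alpha / n_tests).any()

print(f"P(at least one 'significant' result), no correction: "
      f"{any_hit_uncorrected / n_experiments:.2f}")   # roughly 1 - 0.95**20 ≈ 0.64
print(f"P(at least one 'significant' result), Bonferroni:    "
      f"{any_hit_bonferroni / n_experiments:.2f}")    # close to the nominal 0.05
```

Reporting only the test that happened to cross the threshold, without mentioning the other nineteen, is the classic form of this problem.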
Some of these issues are avoidable with different statistical tools (e.g. e-values permit optional stopping and continual monitoring); see issues with p-values. But others cannot be combated regardless of the sophistication of your statistical tools (removing observations or restarting the study, for instance). If a practitioner is determined to achieve significance, complicated math is not going to stop them.
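The optional stopping problem that e-values address can be seen in a quick sketch (the batch size and number of peeks below are assumptions for illustration). Peeking at a fixed-sample p-value after each new batch and stopping as soon as it dips below 0.05 inflates the false positive rate well above the nominal 5%, even though the data are pure noise.

```python
# Simulate "collect data, peek at the p-value, stop if significant" on null data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_simulations = 2000
batch = 10          # collect 10 more null observations per peek
max_peeks = 20      # up to 200 observations in total
alpha = 0.05

false_positives = 0
for _ in range(n_simulations):
    data = np.empty(0)
    for _ in range(max_peeks):
        data = np.concatenate([data, rng.normal(size=batch)])  # true mean is 0
        if stats.ttest_1samp(data, popmean=0.0).pvalue < alpha:
            false_positives += 1   # declared "significant" on pure noise
            break

print(f"False positive rate with optional stopping: "
      f"{false_positives / n_simulations:.2f}")  # noticeably above 0.05
```

An e-value-based (anytime-valid) test keeps its error guarantee under this kind of continual monitoring, which is exactly the advantage referred to above.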
Some famous examples of p-hacking: