We already looked at this on icyboards. So the explanation of what happened:
MySQL's 'utf8' isn't actually UTF-8. Instead, they fucked it up and only store up to 3 bytes per character. This is wrong: a UTF-8 character can be up to 4 bytes. MySQL released a work-around in 2010 called 'utf8mb4', but they never told anyone about it and even to this day most people don't know about it, and switching to it doesn't fix what's already broken. icyboards used utf8, not utf8mb4, as did... pretty much everything.
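If you want to see the 3-vs-4 byte thing for yourself, here's a quick Python sketch. The function names (utf8_width, fits_mysql_utf8) are just made up for illustration; this only checks byte counts and doesn't touch any database.

```python
def utf8_width(ch: str) -> int:
    """Number of bytes one character takes when encoded as real UTF-8."""
    return len(ch.encode("utf-8"))


def fits_mysql_utf8(text: str) -> bool:
    """True if every character needs 3 bytes or fewer,
    i.e. it would fit in MySQL's legacy 3-byte 'utf8'."""
    return all(utf8_width(ch) <= 3 for ch in text)


# One sample character for each possible width.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(f"U+{ord(ch):04X} takes {utf8_width(ch)} byte(s)")
```

Anything that makes fits_mysql_utf8 return False is text that needs utf8mb4 to be stored intact.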
That missing fourth byte is why things were fucked up: any character that needs the fourth byte would need to be converted from its broken 3-byte state (because MySQL couldn't store the fourth byte) to either its correct 4-byte state (assuming this forum doesn't have the same problem) or an 'equivalent' character of 3 bytes or fewer. One caveat on the usual examples: the 'directional' quotes that Word or TextEdit use, and most of the fancy hyphens, are themselves only 3 bytes in UTF-8; the characters that truly need 4 bytes are the ones outside the Basic Multilingual Plane, like emoji.
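For the 'equivalent character' route, a rough Python sketch of the idea looks like this. To be clear, this is not the actual conversion we talked about for icyboards: the STAND_INS table and the downgrade function are hypothetical, and it works on plain strings rather than on a live database.

```python
import re

# Any code point above U+FFFF needs 4 bytes in UTF-8.
FOUR_BYTE = re.compile("[\U00010000-\U0010FFFF]")

# Hypothetical stand-ins; a real table would be chosen by hand.
STAND_INS = {
    "\U0001F600": ":)",             # grinning face emoji
    "\U0001D11E": "(treble clef)",  # musical symbol G clef
}


def downgrade(text: str, fallback: str = "\uFFFD") -> str:
    """Swap every 4-byte character for a stand-in of 3 bytes or fewer.
    Characters with no stand-in become U+FFFD (the replacement character)."""
    return FOUR_BYTE.sub(lambda m: STAND_INS.get(m.group(), fallback), text)


print(downgrade("nice work \U0001F600"))
```

Text that only contains characters of 3 bytes or fewer passes through unchanged, so running it twice is harmless.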
We discussed doing this on icyboards; I know how to do that conversion.
It was decided, however, that it wasn't worth the risk of FUBARing the database (always a risk when doing multiple table-wide find-and-replaces), and that authors could manually fix their old stuff if they wished. The idea of having admins/mods fix things manually was raised as well, and that was shot down on the grounds that authors' works shouldn't be edited without their consent.
tl;dr: History lesson on the issue.