This is more of a general information post regarding corrupted posts, and can be ignored.
daniel_gudman said:
This is basically unrelated, but during the software update one of my threads had completely destroyed single quotes, double quotes, N-dashes, and M-dashes; they got replaced with four-character nonsense, like "alpha, amperstand, epsilon, tab", so I copypastad into MS Word to use find/replace and then copypasta'd back (because if there's find/replace inside the forum text editor I don't know about it). (Also I don't know if any of that detail is relevant but I'm definitely a "computer user" not a "programmer" so that might be meaningful or not IDK).
As a "computer user", to use your own words, I do not know if you would care to know, but the four-digit code with the ampersand is the behind-the-scenes way HTML encodes Unicode characters (a numeric character reference).
Generally, a "computer user" should never see those codes unless they type them in themselves. A code exists for every character, as a decimal or hexadecimal number and sometimes as an abbreviated name, but most characters, particularly numbers and letters, never need one. The ampersand encoding came about as a standard way to represent the extended characters.
The general upside is that, when everything works, a Unicode-aware site or program should never show those codes. BUT, an advantage is that such sites generally accept them in post boxes.
This allows for neat things if you know the code for the character you want to show: type the code instead of hunting down how to get the character through more conventional means.
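Take a simple example, the hyphen (do keep in mind there are multiple hyphens, each with a different code; the codes below are for the Unicode HYPHEN, U+2010, while the key on a keyboard normally produces the hyphen-minus, U+002D):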
Code:
normal typing: -
decimal code (spaces added by me): & # 8 2 0 8 ;
hexadecimal code (spaces added by me): & # x 2 0 1 0 ;
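For anyone curious, both forms are just the character's Unicode number written in decimal or in hexadecimal. Here is a minimal Python sketch of that; the helper name `char_refs` is my own invention for illustration, not part of any library:

```python
def char_refs(ch):
    """Return the (decimal, hexadecimal) HTML numeric references for one character."""
    code_point = ord(ch)  # the character's Unicode code point as an integer
    return f"&#{code_point};", f"&#x{code_point:X};"

# U+2010 is the Unicode HYPHEN used in the example above
decimal_ref, hex_ref = char_refs("\u2010")
print(decimal_ref, hex_ref)
```

Passing in a different character (for example a curly quote) gives the codes for that character instead.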
The upside is that if something is corrupted and you see the codes, you merely need to look them up and do a search/replace to restore the original look...
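If you are comfortable running a small script, the lookup and replace can also be done in one go. This is only a rough sketch in Python, assuming the damaged text really does contain HTML character references and nothing else went wrong; the `restore` helper and the sample string are made up for illustration:

```python
import html

def restore(text):
    # html.unescape (Python standard library) turns named and numeric
    # character references back into the characters they stand for
    return html.unescape(text)

# Hypothetical damaged text with a curly apostrophe, curly quotes, and a dash
sample = "It&#8217;s a &#8220;test&#8221; &#x2014; sort of"
print(restore(sample))
```

Text that contains no references passes through unchanged, so it should be safe to run over a whole post.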
Here is a Unicode conversion link:
- [Direct] http://www.fileformat.info/info/unicode/char/a.htm
EDIT: Interesting ... this forum converted the decimal code in a code block... I really would not have expected something like that since a code block is supposed to be unprocessed.