Posted 4 April 2025, 4:34 pm EST - Updated 4 April 2025, 4:39 pm EST
Hi, we are having a problem when importing files: a cell that contains a space sometimes comes in with a different character code for that space.
Most of the time we get a **Standard Space (ASCII 32 / Hex 20)** (shown in this screenshot from Word with formatting symbols enabled) (1):
George
But sometimes we encounter this: Non-breaking Space (Unicode U+00A0 / Hex A0) (2):
- Standard Space (ASCII 32 / Hex 20): This is the most common representation of a space character. It is used in most text and data processing scenarios.
- Non-breaking Space (Unicode U+00A0 / Hex A0): This is a special space character that prevents line breaks at its position. It is often used in formatting to keep elements together on the same line.
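To confirm which of the two characters a given cell actually contains, one option is to dump the code points of any space-like characters in the exported text. The sketch below is only an illustration in Python, assuming the cell value is available as a decoded string; `describe_spaces` is a hypothetical helper name, not part of any spreadsheet product.

```python
import unicodedata

def describe_spaces(text):
    """List every space-separator character in text other than the ASCII space.

    Returns (index, code point, Unicode name) tuples. "Zs" is the Unicode
    general category for space separators, which includes U+00A0.
    """
    found = []
    for i, ch in enumerate(text):
        if ch != " " and unicodedata.category(ch) == "Zs":
            found.append((i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return found

# A cell that looks like "John Smith" but holds a non-breaking space:
print(describe_spaces("John\u00a0Smith"))
```

An empty list means the text contains only standard spaces (or no spaces at all).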
We’d like all spaces converted to the regular space because that makes the text easier to parse downstream in our system program, which is written in C.
Looking into this further… there are about 15 different types of “space” characters you could run into:
1. Standard Space (ASCII 32 / Hex 20): The regular space character used in most text.
2. Non-breaking Space (Unicode U+00A0 / Hex A0): Prevents line breaks at its position.
3. En Space (Unicode U+2002): Half the width of an em space (traditionally the width of an uppercase "N").
4. Em Space (Unicode U+2003): As wide as the font's point size (traditionally the width of an uppercase "M").
5. Three-per-em Space (Unicode U+2004): One-third of an em space.
6. Four-per-em Space (Unicode U+2005): One-fourth of an em space.
7. Six-per-em Space (Unicode U+2006): One-sixth of an em space.
8. Figure Space (Unicode U+2007): The same width as a digit.
9. Punctuation Space (Unicode U+2008): The same width as a period.
10. Thin Space (Unicode U+2009): Thinner than a standard space.
11. Hair Space (Unicode U+200A): Even thinner than a thin space.
12. Zero Width Space (Unicode U+200B): No width, used for word boundaries.
13. Narrow No-break Space (Unicode U+202F): A narrow version of the non-breaking space.
14. Medium Mathematical Space (Unicode U+205F): Used in mathematical notation.
15. Ideographic Space (Unicode U+3000): The width of an ideographic character.
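For illustration, here is a minimal sketch of the kind of conversion we have in mind, written in Python as a pre-processing step outside the spreadsheet. It assumes the cell text is already a decoded string; `SPACE_VARIANTS` and `normalize_spaces` are hypothetical names, and the table covers only the 14 non-standard characters listed above.

```python
# The 14 non-standard space characters from the list above (items 2-15).
SPACE_VARIANTS = (
    "\u00a0"  # non-breaking space
    "\u2002\u2003\u2004\u2005\u2006"  # en, em, three-, four-, six-per-em
    "\u2007\u2008\u2009\u200a"  # figure, punctuation, thin, hair
    "\u200b"  # zero width space
    "\u202f\u205f\u3000"  # narrow no-break, medium mathematical, ideographic
)

# Translation table that maps each variant to the standard space (U+0020).
TO_ASCII_SPACE = str.maketrans({ch: " " for ch in SPACE_VARIANTS})

def normalize_spaces(text):
    """Replace every listed space variant with a standard ASCII space."""
    return text.translate(TO_ASCII_SPACE)

print(normalize_spaces("John\u00a0Smith") == "John Smith")
```

One design choice to be aware of: the zero width space (U+200B) is invisible, so mapping it to a standard space inserts a visible gap where none appeared before. Mapping it to an empty string instead may suit some data better.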
The question is: is there a simple way to convert any space type into the Standard Space (ASCII 32 / Hex 20) upon import or by selecting the cells? We really *don’t want the non-breaking* variety of space; we need it to be the Standard Space for our parsing downstream.